[Numpy-discussion] Re: Simplifying array()

Rory Yorke ryorke at telkomsa.net
Fri Jan 14 07:33:21 EST 2005


[Todd]
> I agree with this myself.  Does anyone care if they will no longer be
> able to construct an array from a file or buffer object using array()
> rather than fromfile() or NumArray(), respectively?  Is a deprecation
> process necessary to remove them? 

There seems to be a majority opinion in favour of deprecation, though
at least Florian uses the sequence-as-a-buffer feature.

[Colin]
> I would suggest deprecation on the way to removal.  For the
> newcomer, who is not yet "clued up" some advice on the instantiation
> of NumArray would help.  Currently,

The deprecation warning could include a pointer to NumArray or
fromfile, as appropriate. I think some of the Python stdlib
deprecations (doctest?) do exactly this. The NumArray docs do need to
be fixed, though.
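
As a rough illustration of what such a warning could look like, here
is a minimal sketch. The name array_sketch, its signature, and the
message wording are made up for this example and are not numarray's
actual code; only the standard warnings module is real.

```python
import warnings


def array_sketch(sequence=None, file=None, buffer=None):
    """Hypothetical stand-in for array(); not numarray's real signature."""
    if file is not None:
        # Point the caller at the replacement named in the discussion.
        warnings.warn(
            "array() from a file is deprecated; use fromfile() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return "from-file"
    if buffer is not None:
        warnings.warn(
            "array() from a buffer is deprecated; use NumArray() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return "from-buffer"
    # The ordinary sequence path stays silent.
    return "from-sequence"
```

With stacklevel=2 the warning is reported at the caller's line, which
is where the user needs to make the change.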

[Colin]
> Rory leaves in type and typecode.  It would be good to eliminate
> this apparent overlap.  Why not deprecate and then drop type?  As a
> compromise, either could be accepted as a NumArray.__init__
> argument, since it is easy to distinguish between them.

[Perry]
> Tim is right about this. The rationale was that typecode is
> inaccurate since types are no longer represented by letter codes
> (one can still use them for backward compatibility).

Also, the type keyword matches the NumArray type method. It does have
the downside of clashing with the type builtin, of course.

> It would be good to clarify the acceptable content of a sequence.  A

I think this is quite important, though perhaps not too difficult. I
think any sequence, or nested sequences should be accepted, provided
that they are "conformally sized" (for lack of a better phrase) and
that the innermost sequences contain number types. I'll try to word
this more precisely for the docs.

Note that a NumArray is a sequence, in the sense that it has
__getitem__ and __len__ methods, and is indexed from 0 upwards.
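
One way to pin down "conformally sized" is a check along the
following lines. This is only a sketch of the rule described above,
not code from numarray; nested_shape is a hypothetical helper, and it
deliberately rejects strings, which are discussed separately.

```python
from numbers import Number


def nested_shape(obj):
    """Return the shape of a conformally sized nested sequence.

    Raises ValueError if sibling sub-sequences differ in shape, and
    TypeError if an innermost item is neither a number nor a sequence.
    """
    if isinstance(obj, Number):
        return ()
    if isinstance(obj, (str, bytes)):
        raise TypeError("strings are left out of this sketch")
    try:
        n = len(obj)
    except TypeError:
        raise TypeError("innermost items must be numbers: %r" % (obj,))
    if n == 0:
        return (0,)
    # Only __len__ and __getitem__ are used, indexing from 0 upwards.
    shapes = [nested_shape(obj[i]) for i in range(n)]
    if any(s != shapes[0] for s in shapes[1:]):
        raise ValueError("not conformally sized")
    return (n,) + shapes[0]


print(nested_shape([[1, 2, 3], [4, 5, 6]]))  # (2, 3)
```

Under this rule [[1, 2], [3]] and [1, [2]] are both rejected, since
their items do not all have the same shape.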

Strings are also sequences, and Alexander made a comment to the patch
that array() should handle sequences of strings. Consider Numeric's
behaviour:

>>> array(["abc",[1,2,3]])
array([[97, 98, 99],
       [ 1,  2,  3]])

I think this needs to be handled in fromlist, which, as far as I can
tell, handles fairly general sequences, but not strings.

Note that this leads to a different interpretation of array(["abcd"])
and array("abcd").

According to the above, array(["abcd"]) should return
array([[97,98,99,100]]) and, since plain strings go straight to
fromstring, array("abcd") should return array([1684234849]) (probably
dependent on endianness, what Long is, etc.). Is this acceptable?
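
The two readings of the same four characters can be reproduced with
the standard library alone. The sketch below uses struct rather than
numarray, and assumes a 4-byte integer, so it only illustrates where
the numbers come from.

```python
import struct

text = b"abcd"

# Element-wise reading: one integer per character, as for ["abcd"].
per_character = [list(text)]
print(per_character)  # [[97, 98, 99, 100]]

# Raw-buffer reading: the same 4 bytes taken as one 32-bit integer.
little_endian = struct.unpack("<i", text)[0]
big_endian = struct.unpack(">i", text)[0]
print(little_endian)  # 1684234849
print(big_endian)     # 1633837924
```

So the value 1684234849 is what a little-endian machine with a 4-byte
integer type would give; a big-endian machine would give 1633837924.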

[Colin]
>Is the function asarray redundant?

[Tim]
> No, the copy=False parameter is redundant ;) Well as a pair they are

I'm not sure I follow Tim's argument, but asarray is not redundant for
a different reason: it returns any NDArray argument unchanged, without
calling array. generic.ravel calls numarraycore.asarray, so ravel()ing
a RecArray, or any other non-NumArray NDArray, requires asarray to
remain as it is.
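
In outline, the behaviour described here looks like the sketch below.
The classes are empty stand-ins that only borrow the numarray names,
and both functions are simplified guesses at the shape of the logic,
not the numarraycore source.

```python
class NDArray(object):
    """Stand-in base class; the real NDArray is far richer."""


class NumArray(NDArray):
    def __init__(self, data):
        self.data = list(data)


class RecArray(NDArray):
    """Stand-in for an NDArray subclass that is not a NumArray."""


def array(sequence):
    # Simplified: always builds a new NumArray from the sequence.
    return NumArray(sequence)


def asarray(obj):
    # Any NDArray, NumArray or not, is handed back untouched.
    if isinstance(obj, NDArray):
        return obj
    return array(obj)


records = RecArray()
print(asarray(records) is records)  # True
```

If asarray were replaced by a plain call to array, the RecArray would
be pushed through the sequence path instead of being passed through.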

I'm not sure if this setup is desirable, but I decided not to change
too many things at once.

[Colin]
>I suggest that the copy parameter be of the BoolType.  This
>probably has no practical impact but it is consistent with current
>Python usage and makes it clear that this is a Yes/No parameter,
>rather than specifying a number of copies.

This makes sense; as Todd noted, we shouldn't rely on it being a bool,
but having False as the default value is clearer.

Cheers,

Rory



