[Numpy-discussion] Re: Simplifying array()
Rory Yorke
ryorke at telkomsa.net
Fri Jan 14 07:33:21 EST 2005
[Todd]
> I agree with this myself. Does anyone care if they will no longer be
> able to construct an array from a file or buffer object using array()
> rather than fromfile() or NumArray(), respectively? Is a deprecation
> process necessary to remove them?
There seems to be a majority opinion in favour of deprecation, though
at least Florian uses the sequence-as-a-buffer feature.
[Colin]
> I would suggest deprecation on the way to removal. For the
> newcomer, who is not yet "clued up" some advice on the instantiation
> of NumArray would help. Currently,
The deprecation warning could include a pointer to NumArray or
fromfile, as appropriate. I think some of the Python stdlib
deprecations (doctest?) do exactly this. The NumArray docs do need to
be fixed, though.
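A minimal sketch of such a warning, using only the standard warnings module. The wrapper name array_compat, the dispatch tests, and the message wording are hypothetical illustrations, not numarray's actual code; the function only reports which path it would take.

```python
import io
import warnings


def array_compat(source):
    """Hypothetical stand-in for array(); not numarray's implementation."""
    if hasattr(source, "read"):
        # File-like argument: point the caller at fromfile().
        warnings.warn(
            "array(file) is deprecated; use fromfile() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return "file"
    if isinstance(source, (bytes, bytearray, memoryview)):
        # Buffer argument: point the caller at NumArray().
        warnings.warn(
            "array(buffer) is deprecated; use NumArray() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return "buffer"
    # Ordinary sequences stay supported and do not warn.
    return "sequence"


# Record the warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kinds = [
        array_compat(io.BytesIO(b"")),
        array_compat(b"abcd"),
        array_compat([1, 2, 3]),
    ]
messages = [str(w.message) for w in caught]
```

Each message names the replacement call, so a user who hits the warning knows what to switch to without consulting the docs.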
[Colin]
> Rory leaves in type and typecode. It would be good to eliminate
> this apparent overlap. Why not deprecate and then drop type? As a
> compromise, either could be accepted as a NumArray.__init__
> argument, since it is easy to distinguish between them.
[Perry]
> Tim is right about this. The rationale was that typecode is
> inaccurate since types are no longer represented by letter codes
> (one can still use them for backward compatibility).
Also, the type keyword matches the NumArray type method. It does have
the downside of clashing with the type builtin, of course.
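To illustrate the clash, here is a small sketch in modern Python. The function make_array is hypothetical and stands in for any constructor that takes a type keyword; it is not numarray's signature.

```python
import builtins


def make_array(sequence, type=None):
    """Hypothetical constructor with a `type` keyword argument."""
    # Inside this body the name `type` is the keyword argument, so writing
    # type(sequence) would try to call the argument, not the builtin.
    # The builtin is still reachable through the builtins module.
    container = builtins.type(sequence).__name__
    return container, type
```

The clash only affects code inside the function body; callers can still write make_array(seq, type="Int32") without trouble.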
> It would be good to clarify the acceptable content of a sequence. A
I think this is quite important, though perhaps not too difficult. I
think any sequence, or nested sequences, should be accepted, provided
that they are "conformally sized" (for lack of a better phrase) and
that the innermost sequences contain number types. I'll try to word
this more precisely for the docs.
Note that a NumArray is a sequence, in the sense that it has
__getitem__ and __len__ methods, and is indexed from 0 upwards.
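One possible reading of "conformally sized", sketched in plain Python. The helper sequence_shape is hypothetical and not part of numarray; it treats anything with __len__ and 0-based __getitem__ as a sequence, and it rejects strings outright rather than deciding how they should be handled.

```python
from numbers import Number


def sequence_shape(obj):
    """Return the shape of a nested sequence of numbers.

    Raises ValueError if sibling sequences differ in shape.
    Hypothetical helper for illustration only.
    """
    if isinstance(obj, Number):
        return ()  # a bare number has an empty shape
    if isinstance(obj, (str, bytes)):
        raise TypeError("strings are not handled in this sketch")
    try:
        n = len(obj)
        items = [obj[i] for i in range(n)]  # uses __len__ and __getitem__
    except TypeError:
        raise TypeError("expected a number or a sequence, got %r" % (obj,))
    shapes = [sequence_shape(item) for item in items]
    if any(shape != shapes[0] for shape in shapes[1:]):
        raise ValueError("nested sequences are not conformally sized")
    inner = shapes[0] if shapes else ()
    return (n,) + inner
```

Under this reading, [[1, 2, 3], [4, 5, 6]] has shape (2, 3), while [[1, 2], [3]] and [1, [2]] are both rejected.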
Strings are also sequences, and Alexander made a comment to the patch
that array() should handle sequences of strings. Consider Numeric's
behaviour:
>>> array(["abc",[1,2,3]])
array([[97, 98, 99],
       [ 1, 2, 3]])
I think this needs to be handled in fromlist, which, I think, handles
fairly general sequences, but not strings.
Note that this leads to a different interpretation of array(["abcd"])
and array("abcd").
According to the above, array(["abcd"]) should return
array([[97,98,99,100]]) and, since plain strings go straight to
fromstring, array("abcd") should return array([1684234849]) (probably
dependent on endianness, what Long is, etc.). Is this acceptable?
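The two readings of "abcd" can be reproduced with the standard library alone. This sketch uses modern Python rather than Numeric or numarray, and it assumes Long is a 4-byte signed integer.

```python
import struct
import sys

text = "abcd"

# Reading 1: the string is a sequence of characters, one number per character.
as_codes = [ord(ch) for ch in text]

# Reading 2: the string is a raw 4-byte buffer holding a single integer.
raw = text.encode("ascii")
little = struct.unpack("<i", raw)[0]  # little-endian interpretation
big = struct.unpack(">i", raw)[0]     # big-endian interpretation
native = struct.unpack("=i", raw)[0]  # this machine's byte order

print(as_codes)               # [97, 98, 99, 100]
print(little, big)            # 1684234849 1633837924
print(sys.byteorder, native)
```

The value 1684234849 is therefore the little-endian result; a big-endian machine would give 1633837924 for the same string.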
[Colin]
>Is the function asarray redundant?
[Tim]
> No, the copy=False parameter is redundant ;) Well as a pair they are
I'm not sure I follow Tim's argument, but asarray is not redundant for
a different reason: it returns any NDArray arguments without calling
array. generic.ravel calls numarraycore.asarray, and so ravel()ing
RecArrays, or some other non-NumArray NDArray, requires asarray to
remain as it is.
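A sketch of the pass-through behaviour described above. The classes here are empty stand-ins that only mimic the inheritance relationship (NumArray and RecArray both deriving from NDArray); the function bodies are assumptions for illustration, not numarray's source.

```python
class NDArray:
    """Stand-in for numarray's NDArray base class."""


class NumArray(NDArray):
    """Stand-in for the numeric array class."""

    def __init__(self, data):
        self.data = list(data)


class RecArray(NDArray):
    """Stand-in for a non-NumArray NDArray subclass."""


def array(sequence):
    # Always builds a new NumArray from the argument.
    return NumArray(sequence)


def asarray(obj):
    # Any NDArray, NumArray or not, is returned untouched; only other
    # objects go through array().
    if isinstance(obj, NDArray):
        return obj
    return array(obj)
```

With this arrangement, asarray(rec) hands back the same RecArray object, whereas routing every argument through array() would try to convert it to a NumArray.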
I'm not sure if this setup is desirable, but I decided not to change
too many things at once.
[Colin]
>I suggest that the copy parameter be of the BoolType. This
>probably has no practical impact but it is consistent with current
>Python usage and makes it clear that this is a Yes/No parameter,
>rather than specifying a number of copies.
This makes sense; as Todd noted, we shouldn't rely on it being a bool,
but having False as the default value is clearer.
Cheers,
Rory