[Numpy-discussion] rank-0 arrays

Fri Sep 13 00:20:03 EDT 2002

On Thu, 12 Sep 2002, Perry Greenfield wrote:

> If we return rank-0 arrays, what should repr return for rank-0
> arrays? My initial impression is that the following is highly
> undesirable for an interactive session, but maybe it is just me:
> 
> >>> x = arange(10)
> >>> x[2]
> array(2)
>
> We, of course, could arrange __repr__ to return "2" instead,
> in other words print the simple scalar for all cases of rank-0
> arrays. This would yield the expected output in the above
> example. Nevertheless, isn't it violating the intent of repr?
> Are there other examples where Python uses repr in a similar,
> misleading manner? But perhaps most feel that returning array(2)
> is perfectly acceptable and won't annoy users. I am curious
> about what people think about this.

I think it would be confusing if repr returned `2' rather than
`array(2)', because 2 and array(2) are not interchangeable in every
context, and repr output is the first thing one looks at to learn what
an object is.

For example, using array(2) as an index into a native Python sequence
raises TypeError (as expected). In an interactive session the quickest way
to check what a variable holds is to type its name and press enter:

>>> i
array(2)

Now, if repr(array(2)) returned '2', one would at first assume that
`i' is an integer. That makes the following session very confusing:
>>> some_list[i]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence index must be integer
>>> i
2
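
The same confusion can be reproduced without numarray. A minimal sketch,
assuming a hypothetical stand-in class `Rank0` (not numarray's actual type)
whose repr hides the wrapper:

```python
class Rank0:
    """Hypothetical stand-in for a rank-0 array; not numarray's real type."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        # The behaviour argued against: repr looks like a plain scalar.
        return repr(self.value)


some_list = ['a', 'b', 'c']
i = Rank0(2)

# What the interactive prompt would display for `i`.
shown = repr(i)

try:
    some_list[i]
    index_failed = False
except TypeError:
    # Rank0 is not an integer, so list indexing rejects it.
    index_failed = True
```

Here `shown` looks exactly like an integer, yet the indexing still fails,
which is the mismatch described above.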

So I think that repr(array(2)) should return 'array(2)'. Users can
always change this behaviour locally through sys.displayhook, though I
would recommend using a tool like ipython for interactive sessions.
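
A local override along these lines might look as follows. This is only a
sketch in present-day Python: `Rank0` is again a hypothetical stand-in for a
rank-0 array, and `scalar_displayhook` is an illustrative name, not part of
numarray.

```python
import builtins
import sys


class Rank0:
    """Hypothetical stand-in for a rank-0 array; not numarray's real type."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return 'array(%r)' % (self.value,)


def scalar_displayhook(obj):
    """Show Rank0 objects as bare scalars; defer to the default hook otherwise."""
    if isinstance(obj, Rank0):
        builtins._ = obj        # keep `_` pointing at the real object
        print(repr(obj.value))
    else:
        sys.__displayhook__(obj)


# Opt in for this session only; repr(Rank0(2)) itself is unchanged.
sys.displayhook = scalar_displayhook
```

With this hook installed, only what the prompt echoes changes; repr still
returns 'array(2)', so logs and debugging output stay unambiguous.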

Btw, note that in an interactive session it might *seem* desirable for
repr(string) to return str(string) as well, for example when viewing
documentation. Wouldn't it be nice to have

>>> sys.displayhook.__doc__
displayhook(object) -> None

Print an object to sys.stdout and also save it in __builtin__._

>>>

instead of the current behaviour:

>>> sys.displayhook.__doc__
'displayhook(object) -> None\n\nPrint an object to sys.stdout and also
save it in __builtin__._\n'
>>>

> The second issue is an efficiency one. Currently numarray uses
> Python objects for arrays. If we return rank-0 arrays for
> single item indexing, then some naive uses of larger arrays
> as sequences may lead to an enormous number of array objects
> to be created. True, there will be equivalent means of doing
> the same operation that won't result in massive object creations
> (such as specifically converting an array to a list, which would 
> be done much faster). Is this a serious problem?

Could array.__getitem__ and __getslice__ detect when their argument is
an array and avoid creating Python objects while iterating over the indices?
If that is technically possible, then efficiency is not a good reason to give
up returning rank-0 arrays. The actual implementation may come later, though.

> If people still want rank-0 arrays, what should repr do?

Always return 'array(...)'.

You can also ask python-dev for advice if numarray is being considered
for inclusion in the Python library in the future. I am sure the repr issue
will be brought up if repr == str for rank-0 arrays.

Pearu