[Python-ideas] Python multi-dimensional array constructor

Todd toddrjen at gmail.com
Wed Oct 19 15:53:43 EDT 2016


On Wed, Oct 19, 2016 at 3:24 PM, Thomas Nyberg <tomuxiong at gmx.com> wrote:

> Personally I like the way that numpy does it now better (even for
> multidimensional arrays). Being able to index into the different sub
> dimension using just [] iteratively matches naturally with the data
> structure itself in my mind. This may also just be my fear of change
> though...
>
>
I agree, that is one of the reasons this is still using "[ ]".  You can
think of the "|" as more of a dimension delimiter.

Also keep in mind that tuples and dicts still use [ ] for indexing even
though their constructor doesn't use [ ].


> Here is an example of how it would be used for a 1D array:
>>
>> a = [| 0, 1, 2 |]
>>
>> Compared to the current approach:
>>
>> a = np.ndarray([0, 1, 2])
>>
>
> What would the syntax do if you don't have numpy installed? Is the syntax
> tied to numpy or could other libraries make use of it?
>

That would depend on the implementation, which is another issue I hesitate
to even discuss up because it is a huge can of worms.

The most plausible implementation in my mind would be for projects like
IPython, Spyder, Sage, or some array-oriented language that compiles to
Python to have their own hooks that would replace this sort syntax with
"np.ndarray" behind-the-scenes.

A less likely scenario that occurred to me would be for the Python
interpreter to provide some sort of hook to allow a class to be registered
as the handler for this syntax.  So someone could register numpy, dask,
dynd, or whatever they wanted as the handler.  If nothing was registered
using the syntax would raise an exception (perhaps NameError or some new
exception).  With equivalent "(| |)" and "{| |}" syntax you could
conceivably register three packages.  I figured perhaps "[| |]" would be
used for your primary array class (which currently would pretty much always
be a ndarray), and there could be a more dict-like "{| |}" syntax that
could be used for pandas or xarray, leaving "(| |)" for a more
special-purpose library of your choosing.  But that would be a convention,
it would be entirely up to the user.  Behind-the-scenes, this syntax would
be converted to nested tuples or lists (or maybe dicts for "{| |}") and
passed to the constructor or a special classmethod for the registered class
to handle however it sees fit.

There are all sorts of questions and corner cases for this hook approach
though.  Could people change the registered handler after it is set?  At
what points during script executation would setting the handler be allowed,
at any point or only near the beginning?  Would "{| |}" use the same syntax
or some sort of dict-like syntax?  If dict-like, would there be separate
dict-like and set-like syntaxes, resulting in four handlers?  Or would
list-like and dict-like syntaxes be allowed in all cases, and handlers
would need to deal with getting lists/tuples or dicts (even if handling was
simply raising a TypeError)?  Does the hook provide lists or tuples?  Does
the data get fed directly to the constructor or to a special class method?
If the former, how do classes identify themselves as being able to act as
handlers, or should users be allowed to register any class?  There are so
many questions I don't have good answers to I don't feel comfortable
proposing this sort of hook approach as something that should actually be
implemented.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20161019/e44b2dc5/attachment.html>


More information about the Python-ideas mailing list