[Numpy-discussion] ndarray.T2 for 2D transpose

Thu Apr 7 11:42:04 EDT 2016

On Thu, Apr 7, 2016 at 11:35 AM, <josef.pktd at gmail.com> wrote:

> On Thu, Apr 7, 2016 at 11:13 AM, Todd <toddrjen at gmail.com> wrote:
> > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith <njs at pobox.com> wrote:
> >>
> >> On Wed, Apr 6, 2016 at 10:43 AM, Todd <toddrjen at gmail.com> wrote:
> >> >
> >> > My intention was to make linear algebra operations easier in numpy.
> >> > With the @ operator available, it is now very easy to do basic linear
> >> > algebra on arrays without needing the matrix class.  But getting an
> >> > array into a state where you can use the @ operator effectively is
> >> > currently pretty verbose and confusing.  I was trying to find a way to
> >> > make the @ operator more useful.
> >>
> >> Can you elaborate on what you're doing that you find verbose and
> >> confusing, maybe paste an example? I've never had any trouble like
> >> this doing linear algebra with @ or dot (which have similar semantics
> >> for 1d arrays), which is probably just because I've had different use
> >> cases, but it's much easier to talk about these things with a concrete
> >> example in front of us to put everyone on the same page.
> >>
> >
> > Let's say you want to do a simple matrix multiplication example.  You
> > create two example arrays like so:
> >
> >    a = np.arange(20)
> >    b = np.arange(10, 50, 10)
> >
> > Now you want to do
> >
> >     a.T @ b
> >
> > First you need to turn a into a 2D array.  I can think of 10 ways to do
> > this off the top of my head, and there may be more:
> >
> >     1a) a[:, None]
> >     1b) a[None]
> >     1c) a[None, :]
> >     2a) a.shape = (1, -1)
> >     2b) a.shape = (-1, 1)
> >     3a) a.reshape(1, -1)
> >     3b) a.reshape(-1, 1)
> >     4a) np.reshape(a, (1, -1))
> >     4b) np.reshape(a, (-1, 1))
> >     5) np.atleast_2d(a)
> >
> > 5 is pretty clear, and will work fine with any number of dimensions, but
> > is also long to type out when trying to do a simple example.  The
> > different variants of 1, 2, 3, and 4, however, will only work with 1D
> > arrays (making them less useful for functions), do not make it
> > immediately obvious to me what the result will be (I always need to try
> > it to make sure the result is what I expect), and are easy to get mixed
> > up in my opinion.  They also require people to keep a mental list of
> > lots of ways to do what should be a very simple task.
> >
> > Basically, my argument here is the same as the argument from PEP 465 for
> > the inclusion of the @ operator:
> >
> > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
> >
> > "A large proportion of scientific code is written by people who are
> > experts in their domain, but are not experts in programming. And there
> > are many university courses run each year with titles like "Data analysis
> > for social scientists" which assume no programming background, and teach
> > some combination of mathematical techniques, introduction to programming,
> > and the use of programming to implement these mathematical techniques,
> > all within a 10-15 week period. These courses are more and more often
> > being taught in Python rather than special-purpose languages like R or
> > Matlab.
> >
> > For these kinds of users, whose programming knowledge is fragile, the
> > existence of a transparent mapping between formulas and code often means
> > the difference between succeeding and failing to write that code at all."
>
> This doesn't work because of the ambiguity between column and row vector.
>
> In most cases 1d vectors in statistics/econometrics are column
> vectors. Sometimes it takes me a long time to figure out whether an
> author uses row or column vectors for transpose.
>
> I.e., I often need x.T dot y, which works for both 1d and 2d to produce
> the inner product, but the outer product most of the time requires a
> column vector, so it's defined as x dot x.T.
>
> I think keeping around explicitly 2d arrays if necessary is less
> error-prone and less confusing.
>
> But I wouldn't mind a shortcut for atleast_2d (although more often I
> need atleast_2dcol to translate formulas).
>
>
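
To make sure we are talking about the same thing, here is a minimal sketch
of the inner/outer product ambiguity as I understand it.  The arrays x and y
are made up for illustration and are not from the earlier messages.

```python
import numpy as np

x = np.arange(3.0)                 # 1D, shape (3,): [0., 1., 2.]
y = np.array([1.0, 2.0, 4.0])      # 1D, shape (3,)

# On a 1D array .T is a no-op, so x.T @ y is simply the inner product.
inner_1d = x.T @ y                 # 0*1 + 1*2 + 2*4 = 10.0

# For the same reason x @ x.T is also an inner product, not an outer product.
not_outer = x @ x.T                # 0 + 1 + 4 = 5.0

# With explicit 2D column vectors the textbook formulas carry over directly.
xc = x[:, None]                    # shape (3, 1)
yc = y[:, None]                    # shape (3, 1)
inner_2d = xc.T @ yc               # shape (1, 1)
outer_2d = xc @ xc.T               # shape (3, 3)
```

So the 1D form gives the inner product either way, and the outer product
only appears once the column shape is made explicit.
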
At least from what I have seen, in all cases in numpy where a 1D array is
treated as a 2D array, it is treated as a row vector.  The examples I can
think of are atleast_2d, hstack, vstack, and dstack.  So using this
convention would be in line with how it is used elsewhere in numpy.
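
A quick sketch of what I mean, using a made-up four-element array (the
variable names are only for illustration).  As far as I can tell, hstack on
1D inputs just concatenates and stays 1D, so atleast_2d, vstack, and dstack
are the clearer demonstrations of the row-vector convention.

```python
import numpy as np

a = np.arange(4)                       # 1D, shape (4,)

row = np.atleast_2d(a)                 # promoted to a row: shape (1, 4)
stacked_v = np.vstack([a, a])          # each input becomes a row: shape (2, 4)
stacked_d = np.dstack([a, a])          # inputs treated as (1, 4, 1): shape (1, 4, 2)
stacked_h = np.hstack([a, a])          # 1D inputs are concatenated: shape (8,)

# Getting a column vector instead takes an explicit extra step.
col = np.atleast_2d(a).T               # shape (4, 1)
```
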