[Numpy-discussion] dtype repr change?

Mark Wiebe mwwiebe at gmail.com
Fri Jul 29 09:49:35 EDT 2011


On Thu, Jul 28, 2011 at 3:09 PM, Nathaniel Smith <njs at pobox.com> wrote:

> I have a different question about this than the rest of the thread. I'm
> confused at why there isn't a programmatic way to create a datetime dtype,
> other than by going through this special string-based mini-language. I guess
> I generally think of string-based dtype descriptors as being a legacy thing
> necessary for compatibility, but probably better to avoid in new code, now
> that we have nice python ways to describe dtypes with scalar types and such.
> Probably that's a minority opinion, but even putting it aside: it certainly
> isn't the case that we can describe arbitrary dtypes using strings right now
> - think of record types and so on. And even restricting ourselves to atomic
> styles, I'm skeptical about this claim that we'll be able to use strings for
> everything in the future, too. My pet possible future dtype is one for
> categorical data, which would be parametrized by the set of possible
> categories; I don't relish the idea of making up some ad hoc syntax for
> specifying such lists within the dtype mini-language.
>
> So is the plan actually to promote strings as the canonical way of
> describing dtypes? Aside from the question of what repr does, shouldn't
> there actually be some sort of syntax like dtype=np.datetime64("D")
> available as a working option?
>

I've thought about having something like this in addition to the string
format, but haven't worked it through. Calling np.datetime64("D") creates
a datetime64 scalar, not a dtype. What would more closely match the string
syntax is np.datetime64["D"], which would require overloading __getitem__ on
the type object, something I haven't tried. Since this could just as easily
be added later, I was treating it as pretty low on the long datetime TODO
list.
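
A minimal sketch of the distinction, assuming a NumPy version with
unit-aware datetime64 support. The DateTime64Spec class and its metaclass
are hypothetical names used only for illustration; they are not NumPy API,
and np.datetime64 itself is not known to support this subscript syntax.

```python
import numpy as np

# The string mini-language builds a dtype object.
day_dtype = np.dtype("datetime64[D]")

# Calling the scalar type builds a value, not a dtype.
scalar = np.datetime64("2011-01-01", "D")

# The unit can be read back from the dtype programmatically.
unit, count = np.datetime_data(day_dtype)


class _UnitIndexable(type):
    """Hypothetical metaclass: makes the class itself subscriptable."""

    def __getitem__(cls, unit):
        # Delegate to the existing string-based constructor.
        return np.dtype(f"datetime64[{unit}]")


class DateTime64Spec(metaclass=_UnitIndexable):
    """Hypothetical stand-in for a subscriptable datetime64 type object."""


# Subscripting the class yields the same dtype as the string form.
same = DateTime64Spec["D"] == day_dtype
```

Python routes `SomeClass[key]` through `__getitem__` on the metaclass, which
is why the sketch needs one; a plain method on the class would only handle
subscripting of instances.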

I'm personally more in favour of there being a canonical string
representation of each dtype, similar to the way Python's repr(obj) is
intended to be able to reconstruct the object where possible. It would be
nice to come up with an unambiguous string format for struct dtypes; that
is definitely something I see as missing. Being able to construct the dtype
without using the string is also very good, though.
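
The repr round-trip idea can be checked with a short sketch. The helper name
roundtrips is an assumption made for illustration, and the exact repr text
may differ between NumPy versions, so the code compares dtypes rather than
strings.

```python
import numpy as np


def roundtrips(dt):
    """Return True if evaluating repr(dt) rebuilds an equal dtype."""
    # repr(dt) looks like "dtype(...)", so expose np.dtype under that name.
    rebuilt = eval(repr(dt), {"dtype": np.dtype})
    return rebuilt == dt


examples = [
    np.dtype(bool),
    np.dtype("datetime64[D]"),
    # A struct dtype, described with a list of (name, format) pairs.
    np.dtype([("a", "<i4"), ("b", "<f8")]),
]

results = [roundtrips(dt) for dt in examples]
```

Note that the struct dtype's repr reconstructs it through a Python list of
tuples rather than through a single descriptor string.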

-Mark


> - Nathaniel
> On Jul 27, 2011 10:55 AM, "Mark Wiebe" <mwwiebe at gmail.com> wrote:
> > This was the most consistent way to deal with the parameterized dtype in
> > the repr, making it more future-proof at the same time. It was producing
> > reprs like "array(['2011-01-01'], dtype=datetime64[D])", which is clearly
> > wrong, and putting quotes around it makes it work in general for all
> > possible dtypes, present and future.
> >
> > -Mark
> >
> > On Wed, Jul 27, 2011 at 12:50 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> >
> >> Hi,
> >>
> >> I see that (current trunk):
> >>
> >> In [9]: np.ones((1,), dtype=bool)
> >> Out[9]: array([ True], dtype='bool')
> >>
> >> - whereas (1.5.1):
> >>
> >> In [2]: np.ones((1,), dtype=bool)
> >> Out[2]: array([ True], dtype=bool)
> >>
> >> That is breaking quite a few doctests. What is the reason for the
> >> change? Something to do with more planned dtypes?
> >>
> >> Thanks a lot,
> >>
> >> Matthew
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
>