[Numpy-discussion] dtype repr change?

Matthew Brett matthew.brett at gmail.com
Wed Jul 27 17:32:06 EDT 2011


Hi,

On Wed, Jul 27, 2011 at 1:12 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
> On Wed, Jul 27, 2011 at 3:09 PM, Robert Kern <robert.kern at gmail.com> wrote:
>>
>> On Wed, Jul 27, 2011 at 14:47, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> > On Wed, Jul 27, 2011 at 2:44 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Wed, Jul 27, 2011 at 12:25 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> >> > On Wed, Jul 27, 2011 at 1:01 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Wed, Jul 27, 2011 at 6:54 PM, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> >> >> > This was the most consistent way to deal with the parameterized
>> >> >> > dtype in the repr, making it more future-proof at the same time.
>> >> >> > It was producing reprs like
>> >> >> > "array(['2011-01-01'], dtype=datetime64[D])", which is clearly
>> >> >> > wrong, and putting quotes around it makes it work in general for
>> >> >> > all possible dtypes, present and future.
>> >> >>
>> >> >> I don't know about you, but I find maintaining doctests across
>> >> >> version changes rather tricky.  For our projects, doctests are
>> >> >> important as part of the automated tests.  At the moment this means
>> >> >> that many doctests will break between 1.5.1 and 2.0.  What do you
>> >> >> think is the best way round this problem?
>> >> >
>> >> > I'm not sure what the best approach is. I think the primary use of
>> >> > doctests should be to validate that the documentation matches the
>> >> > implementation, and anything confirming aspects of a software
>> >> > system should be regular tests. In NumPy, there are
>> >> > platform-dependent differences in 32 vs 64 bit and big vs little
>> >> > endian, so the part of the system that changed already couldn't be
>> >> > relied on consistently. I prefer systems where the code output in
>> >> > the documentation is generated as part of the documentation build
>> >> > process instead of being included in the documentation source
>> >> > files.
>> >>
>> >> Would it be fair to summarize your reply as 'just deal with it'?
>> >
>> > I'm not sure what else I can do to help you, since I think this aspect
>> > of the system should be subject to arbitrary improvement. My
>> > recommendation is in general not to use doctests as if they were
>> > regular tests. I'd rather not back out the improvements to repr, if
>> > that's what you're suggesting should happen. Do you have any other
>> > ideas?
>>
>> In general, I tend to agree that doctests are not always appropriate.
>> They tend to "overtest" and express things that the tester did not
>> intend. It's just the nature of doctests that you have to accept if
>> you want to use them. In this case, the tester wanted to test that the
>> contents of the array were particular values and that it was a boolean
>> array. Instead, it tested the precise bytes of the repr of the array.
>> The repr of an ndarray is not a stable API, and we don't make
>> guarantees about the precise details of its behavior from version to
>> version. doctests work better to test simpler types and methods that
>> do not have such complicated reprs. Yes, even as part of an automated
>> test suite for functionality, not just to ensure the compliance of
>> documentation examples.
>>
>> That said, you could only quote the dtypes that require the extra
>> [syntax] and leave the current, simpler dtypes alone. That's a
>> pragmatic compromise to the reality of the situation, which is that
>> people do have extensive doctest suites already around, without
>> removing your ability to innovate with the representations of the new
>> dtypes.
>
> That sounds reasonable to me, and I'm happy to review pull requests from
> anyone who has time to do this change.

Forgive me, but this seems almost ostentatiously unhelpful.

I understand you have little sympathy for the problem, but, just as a
social courtesy, some pointers as to where to look would have been
useful.
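
To make sure I have understood Robert's suggestion, here is a rough
sketch of the rule in Python. The helper name format_dtype_arg is made
up for illustration and is not part of NumPy, and I am assuming that
"requires the extra [syntax]" can be approximated by "is not a plain
identifier":

```python
def format_dtype_arg(dtype_name):
    """Hypothetical helper, not a NumPy function.

    Return the 'dtype=...' fragment of an array repr, quoting the dtype
    name only when it is not a plain identifier (an assumed stand-in
    for "requires the extra [syntax]").
    """
    if dtype_name.isidentifier():
        # Simple dtypes such as bool or float32 stay unquoted.
        return "dtype=" + dtype_name
    # Parameterized dtypes such as datetime64[D] get quotes.
    return "dtype=" + repr(dtype_name)


for name in ("bool", "float32", "datetime64[D]"):
    print(format_dtype_arg(name))
# prints:
# dtype=bool
# dtype=float32
# dtype='datetime64[D]'
```

If that is the intended rule, doctests showing the simpler dtypes
unquoted would be left alone, and only the new parameterized dtypes
would gain quotes.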

See you,

Matthew


