[Numpy-discussion] numpy 1.7.0 release?

Tue Dec 6 17:13:43 EST 2011

On Tue, Dec 6, 2011 at 4:11 PM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Mon, Dec 5, 2011 at 8:43 PM, Ralf Gommers <ralf.gommers at googlemail.com>
> wrote:
>>
>> Hi all,
>>
>> It's been a little over 6 months since the release of 1.6.0 and the NA
>> debate has quieted down, so I'd like to ask your opinion on the timing of
>> 1.7.0. It looks to me like we have a healthy amount of bug fixes and small
>> improvements, plus three larger chunks of work:
>>
>> - datetime
>> - NA
>> - Bento support
>>
>> My impression is that both datetime and NA are releasable, but should be
>> labeled "tech preview" or something similar, because they may still see
>> significant changes. Please correct me if I'm wrong.
>>
>> There's still some maintenance work to do and pull requests to merge, but
>> a beta release by Christmas should be feasible.
>
>
> To be a bit more detailed here, these are the most significant pull requests
> / patches that I think can be merged with a limited amount of work:
> meshgrid enhancements: http://projects.scipy.org/numpy/ticket/966
> sample_from function: https://github.com/numpy/numpy/pull/151
> loadtable function: https://github.com/numpy/numpy/pull/143
>
> Other maintenance things:
> - un-deprecate putmask
> - clean up causes of "DType strings 'O4' and 'O8' are deprecated..."
> - fix failing einsum and polyfit tests
> - update release notes
>
> Cheers,
> Ralf
>
>
>> What do you all think?
>>
>>
>> Cheers,
>> Ralf
>
>
>

This isn't the place for this discussion but we should start talking
about building a *high performance* flat file loading solution with
good column type inference and sensible defaults, etc. It's clear that
loadtable is aiming for the highest compatibility rather than speed. For
example, I can read a 2800x30 file in < 50 ms with the read_table /
read_csv functions I wrote myself recently in Cython (compared with
loadtable taking > 1 s as quoted in the pull request), but I don't handle
European decimal formats and lots of other sources of unruliness. I
personally don't
believe in sacrificing an order of magnitude of performance in the 90%
case for the 10% case-- so maybe it makes sense to have two functions
around: a superfast custom CSV reader for well-behaved data, and a
slower, but highly flexible, function like loadtable to fall back on.
I think R has two functions read.csv and read.csv2, where read.csv2 is
capable of dealing with things like European decimal format.
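
To make the two-function idea concrete, here is a rough pure-Python
sketch. It is only an illustration of the fast-path / fallback split,
not the Cython reader mentioned above and not loadtable; all three
function names are hypothetical, and the fallback's defaults (";" as
separator, "," as decimal mark) are an assumption modeled on the
European format.

```python
import csv
import io


def read_csv_fast(text, sep=","):
    """Fast path: assumes a header row and plain '.'-decimal numeric fields."""
    lines = text.strip().splitlines()
    header = lines[0].split(sep)
    # float() raises ValueError on anything unruly, e.g. "5;2" or "n/a".
    rows = [[float(field) for field in line.split(sep)] for line in lines[1:]]
    return header, rows


def read_csv_flexible(text, sep=";", decimal=","):
    """Slow path: handles quoting and a configurable decimal mark."""
    reader = csv.reader(io.StringIO(text.strip()), delimiter=sep)
    header = next(reader)
    rows = []
    for record in reader:
        row = []
        for field in record:
            try:
                row.append(float(field.replace(decimal, ".")))
            except ValueError:
                row.append(field)  # leave non-numeric fields as strings
        rows.append(row)
    return header, rows


def read_table_with_fallback(text):
    """Try the strict fast reader first; fall back only when it fails."""
    try:
        return read_csv_fast(text)
    except ValueError:
        return read_csv_flexible(text)
```

With this split, well-behaved input such as "a,b\n1.5,2\n" never
touches the slow path, while "a;b\n1,5;2\n" fails the strict parse and
is handed to the flexible reader.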

- Wes