[Pandas-dev] Managing the pandas firehose

Wed Mar 20 00:42:24 CET 2013

So from the developer page, here is the roadmap

   1. DONE numpy.datetime64 integration, scikits.timeseries codebase
   integration. Substantially improved time series functionality.
   2. Improved PyTables (HDF5) integration
   3. Tools for working with data sets that do not fit into memory
   4. Improved SQL / relational database tools
   5. Better statistical graphics using matplotlib
   6. Integration with D3.js <https://github.com/mikedewar/D3py>
   7. NDFrame data structure for arbitrarily high-dimensional labeled data
   8. Extend GroupBy functionality to regular ndarrays, record arrays
   9. Better support for NumPy dtype hierarchy without sacrificing usability
   10. DONE Add a Factor data type (in R parlance)
   11. Better support for integer NA values
   12. (0.10) Better memory usage and performance when reading very large
   CSV files

blue = done < 0.11
orange = 0.11
yellow = some support, more needed

IMHO
I think 8 is probably more trouble than it's worth
out-of-core (3) is very important
5, 6 are pretty useful
11 is a toss-up; it depends on whether pandas waits for numpy support or rolls its own
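
To make item 11 concrete, here is a minimal sketch of the integer NA problem, assuming the default NumPy-backed integer dtype. The Series and labels are made up for illustration; the point is only that NumPy integers have no NaN, so introducing a missing value forces an upcast to float.

```python
import pandas as pd

# An integer Series with no missing values keeps an integer dtype.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])
int_dtype = s.dtype

# Reindexing adds a label "d" that has no value. NumPy integer arrays
# cannot store NaN, so pandas upcasts the whole Series to float64.
r = s.reindex(["a", "b", "c", "d"])
float_dtype = r.dtype
has_na = bool(r.isnull().iloc[3])

print(int_dtype, "->", float_dtype, "| missing value present:", has_na)
```

Waiting for numpy would mean this upcast stays until NumPy itself can represent a missing integer; rolling our own would mean pandas tracks the missing positions separately.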

any other items that should be on this list?

On Tue, Mar 19, 2013 at 7:18 PM, Chang She <changshe at gmail.com> wrote:

> Just to tack on to this email, I've started talking to some folks about
> applying for a grant to fund pandas development for the next year or so and
> wanted to get your thoughts on hiring someone to spend substantial time on
> pandas.
>
> There are several big questions here:
>
> 1. What are the main things that need to be done in the next year?
> 2. What exactly would that person be responsible for? Would he/she be
> full-time or part-time?
> 3. How much money would that take?
> 4. What organization would the money be funneled through (needs to be a
> non-profit)?
> 5. What metrics can we track over the next year or so to show whether the
> grant was successful?
> 6. How/who do we hire?
>
>
> Some of the stuff that Wes outlined in his email can definitely fall on
> this hypothetical person. Since we're all volunteers, having someone hired
> to make sure things don't fall through the cracks would give us peace of
> mind and save us some stress.
>
> In any case, your thoughts would be appreciated (alternative funding ideas
> are also very welcome!)
>
>
>
> On Mar 19, 2013, at 4:00 PM, Wes McKinney <wesmckinn at gmail.com> wrote:
>
> > Hi all,
> >
> > Welcome to the new pandas developer list! I thought it would be good
> > to have a place for higher level discussions about the project and
> > other initiatives, so I made this.
> >
> > One note that I wanted to pass on as we move toward the 0.11 release
> > and going forward-- if I could get your help classifying and
> > categorizing incoming issues, that would be a big help in staying on
> > top of things. What does this mean?
> >
> > - Incoming issues: mark milestone as next release (bugs and other
> > "must fixes" or low hanging fruit), next next release at your
> > discretion. "Someday" otherwise. On GitHub you can see there are
> > 30-something issues that have no milestone-- in January there were
> > over 100 and I had a "milestone classification" binge. Would be great
> > to not be the only one =P
> > - Pull requests: also mark with a milestone please! This helps keep
> > track of what release pull requests were a part of later on.
> > - Label accordingly-- you all have been doing a good job with this.
> >
> > Code review and pull requests:
> > - For one or two commits that aren't likely to be controversial (e.g.
> > Jeff has been doing a lot of little doc additions), I don't mind if
> > you push directly to master. If you think having someone else (doesn't
> > need to be me necessarily) sign off would be good, then leave until
> > that happens.
> > - I don't mind if you use the green button-- I waffle between regular
> > merges and cherry-picks when the number of commits is small.
> >
> > My main concern with ongoing development is making sure that things
> > don't fall through the cracks and that bugs that come into the issue
> > tracker get promptly classified. Any other thoughts?
> >
> > At some point we'll have to think about release management-- I have
> > been carrying that torch since pandas 0.1, but at some point maybe
> > someone else will do it. Part of it relies on having access to a
> > fully-equipped Windows VM with 32 and 64 bit versions across all
> > Python versions-- I have a virtualbox image that should get hosted
> > someplace that is not a physical box in my apartment at some point.
> >
> > - Wes
> > _______________________________________________
> > Pandas-dev mailing list
> > Pandas-dev at python.org
> > http://mail.python.org/mailman/listinfo/pandas-dev
>