[Pandas-dev] Developing libpandas as a separate codebase + Componentization + Deferred "pandas expressions"

Wed May 3 19:42:34 EDT 2017

On Wed, May 3, 2017 at 4:09 PM, Wes McKinney <wesmckinn at gmail.com> wrote:

> *TOPIC ONE:* I have been thinking about how to proceed with pandas 2.0
> development in a sane way with the following goals:
>
> ...
>

> Changing the internals of Series and DataFrame is going to be a difficult
> process (and frankly, it would be easier to build a brand new project, but
> I am not going to advocate for that). But I think one way we can make
> things easier is by developing "libpandas" and its Python bindings as a
> separate codebase.
>

I'm strongly supportive of a separate "libpandas", but do consider going
further and making "pandas2" a separate thing.

If users have to switch from "import pandas" to "import pandas2", it would
give us the freedom to do some important API clean-up/simplification (e.g.,
for indexing and other pandas methods that don't have well defined type
signatures). Also, we will have the option to leave old stuff behind rather
than immediately porting everything to pandas2 with complete backport
support, which is rather ambitious.

> *TOPIC THREE:* I think we should start developing a "deferred pandas API"
> that is designed and directly developed by the pandas developer community.
>
> ...
>
> * "True" schemas (we'll have to work around pandas 0.x warts with implicit
> casts, etc.)
>
> * Immutable data structures / no mutation outside "amend" operations that
> change values by returning new objects
>
> * Less index-related stuff in this API (perhaps this is controversial, we
> shall see)
>

This all sounds fantastic, but could you clarify a little bit what you mean
by schemas?