[Pandas-dev] Proposal for consistent, clear copy/view semantics in pandas with Copy-on-Write

Stephan Hoyer shoyer at gmail.com
Fri Jul 16 12:58:17 EDT 2021


On Fri, Jul 16, 2021 at 9:04 AM Brock Mendel <jbrockmendel at gmail.com> wrote:

> I do not like the fact that nothing can ever be "just a view" with these
> semantics, including series[::-1], frame[col], frame[:]. Users reasonably
> expect numpy semantics for these.
>
> We should revisit the alternative "clear/simple rules" approach that is
> "indexing on columns always gives a view" (
> https://github.com/pandas-dev/pandas/pull/33597). This is simpler to
> explain/grok, simpler to implement, and not dependent on BlockManager vs
> ArrayManager.
>

I don't know if it is worth the trouble for complex multi-column
selections, but I do see the appeal here.

A simpler variant would be to make indexing out a single Series from a
DataFrame return a view, with everything else doing copy on write. Then the
existing pattern df.column_one[:] = ... would still work.
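
To make the variant concrete, here is a toy sketch in plain Python. It is
not pandas code and says nothing about how pandas would implement this:
ToyFrame, ToyColumn, _writable and to_list are hypothetical names made up
for illustration, and columns are plain lists. Selecting one column by name
gives a view that writes through to its parent, while selecting a list of
columns gives a new frame that shares data until either side is written to.

```python
class ToyColumn:
    """Single-column selection: a view that reads and writes through its parent."""

    def __init__(self, parent, name):
        self._parent = parent
        self._name = name

    def __setitem__(self, key, value):
        # Ask the parent for storage that is safe to mutate in place.
        data = self._parent._writable(self._name)
        if isinstance(key, slice):
            # Broadcast a scalar over the slice, like column[:] = 0.
            count = len(range(*key.indices(len(data))))
            data[key] = [value] * count
        else:
            data[key] = value

    def to_list(self):
        return list(self._parent._columns[self._name])


class ToyFrame:
    """Hypothetical frame: column views, copy on write for everything else."""

    def __init__(self, columns, _share=False):
        if _share:
            # Share the parent's lists for now; copy lazily on first write.
            self._columns = dict(columns)
            self._pending_copy = set(self._columns)
        else:
            self._columns = {name: list(values) for name, values in columns.items()}
            self._pending_copy = set()

    def __getitem__(self, key):
        if isinstance(key, str):
            # Single column: always a view onto this frame.
            return ToyColumn(self, key)
        # Multi-column selection: both sides must copy before writing.
        self._pending_copy.update(key)
        shared = {name: self._columns[name] for name in key}
        return ToyFrame(shared, _share=True)

    def _writable(self, name):
        if name in self._pending_copy:
            self._columns[name] = list(self._columns[name])
            self._pending_copy.discard(name)
        return self._columns[name]


df = ToyFrame({"column_one": [1, 2, 3], "column_two": [4, 5, 6]})
df["column_one"][:] = 0                     # view: df itself is updated
subset = df[["column_one", "column_two"]]   # shares data with df for now
subset["column_two"][:] = 9                 # copy on write: df is left alone
```

The sketch uses df["column_one"] rather than attribute access to stay
short. Under these assumptions the in-place column assignment still updates
the frame, while a write to the multi-column subset never reaches it.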


>
> On Fri, Jul 16, 2021 at 5:26 AM Joris Van den Bossche <
> jorisvandenbossche at gmail.com> wrote:
>
>>
>>
>> On Mon, 12 Jul 2021 at 00:58, Joris Van den Bossche <
>> jorisvandenbossche at gmail.com> wrote:
>>
>>> Short summary of the proposal:
>>>
>>>    1. The result of *any* indexing operation (subsetting a DataFrame or
>>>    Series in any way) or any method returning a new DataFrame, always *behaves
>>>    as if it were* a copy in terms of user API.
>>>
>> To explicitly call out the column-as-Series case (since this is a
>> typical case that right now *always* is a view): "any" indexing
>> operation thus also includes accessing a DataFrame column as a Series (or
>> slicing a Series).
>>
>> So something like s = df["col"] and then mutating s will no longer
>> update df. Similarly for series_subset = series[1:5], mutating
>> series_subset will no longer update series.
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>