[Pandas-dev] EA Naming Conventions

Joris Van den Bossche jorisvandenbossche at gmail.com
Wed Feb 8 10:04:23 EST 2023


On Thu, 26 Jan 2023 at 23:30, Brock Mendel <jbrockmendel at gmail.com> wrote:

> For historical reasons we've built up an EA namespace without much
> internal logic in terms of what is public/private.  While this isn't _that_
> big of a deal, it'd be nice to make this more coherent.  I see two useful
> options:
>

In my opinion (and as far as I recall), the rule was quite clear when
ExtensionArrays were introduced: *everything* on the base class is
considered public for developers, since EA implementors can (or need to)
override those methods. Whether the actual name is public or private
(i.e. has a leading underscore or not) then depends on whether it should
be public for end users (as opposed to implementors).

And we use documentation / comments to indicate to developers (EA
implementors) which parts are required to implement or are optional to
implement.
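
To make the convention concrete, here is a minimal sketch in plain Python.
It is not the pandas source: SketchExtensionArray and ListBackedArray are
hypothetical names, and only _from_sequence and unique mirror names that
exist on the real base class. The docstrings carry the "required" vs
"optional" information, while the underscore only says whether end users
are expected to call the method.

```python
class SketchExtensionArray:
    """Hypothetical stand-in for an EA base class (not the pandas source)."""

    @classmethod
    def _from_sequence(cls, scalars):
        """Required for implementors. Underscore: not meant for end users."""
        raise NotImplementedError(f"{cls.__name__} must implement _from_sequence")

    def unique(self):
        """Optional for implementors. No underscore: end users may call it.

        Generic default that relies only on iteration and _from_sequence.
        """
        seen = []
        for value in self:
            if value not in seen:
                seen.append(value)
        return type(self)._from_sequence(seen)


class ListBackedArray(SketchExtensionArray):
    """Hypothetical implementor: overrides the required pieces only."""

    def __init__(self, values):
        self._data = list(values)

    @classmethod
    def _from_sequence(cls, scalars):
        return cls(scalars)

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


arr = ListBackedArray([1, 2, 2, 3])
deduplicated = arr.unique()  # uses the inherited default implementation
```

Both methods are "public" to the implementor in the sense that both may be
overridden; only unique is something an end user would normally call.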


>
> 1) Use the traditional "an underscore means this should only be called
> from within self".  Very few methods on the base class satisfy that
> characteristic, including the constructor _from_sequence.  One benefit of
> moving to this is it would make "official" that we shouldn't be using
> _values_for_foo from outside EA methods.
>

We don't want to make all those "private" functions for EAs to implement
public to end-users, so I don't think this is an option.
Also, there *are* some valid cases to call the _values_for_.. methods
outside of other EA methods, so this is not a general rule.
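
As a rough illustration of such a case (a sketch under assumptions, not
code taken from pandas): a library-internal helper that lives outside the
array class can reasonably ask the array for its sortable values.
TemperatureArray, sort_array and to_list are hypothetical names; only
_values_for_argsort and take mirror names from the EA interface.

```python
class TemperatureArray:
    """Hypothetical array type with just enough API for the sketch."""

    def __init__(self, celsius):
        self._celsius = list(celsius)

    def _values_for_argsort(self):
        # Plain values whose ordering matches the ordering of the array.
        return self._celsius

    def take(self, indices):
        # Build a new array from the given positions.
        return TemperatureArray(self._celsius[i] for i in indices)

    def to_list(self):
        return list(self._celsius)


def sort_array(arr):
    """Internal helper defined outside the array class.

    It calls the underscore-prefixed hook directly, which is the kind of
    call that a strict "only from within self" rule would forbid.
    """
    keys = arr._values_for_argsort()
    order = sorted(range(len(keys)), key=keys.__getitem__)
    return arr.take(order)


sorted_arr = sort_array(TemperatureArray([3.0, 1.0, 2.0]))
```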


> 2) Use underscores to signal to 3rd party authors whether or not there
> exists a working (not necessarily performant) implementation on the base
> class.  In this scenario authors would _have_ to implement private methods,
> while implementing public methods would be optional.
>

That would make some of the currently private (and not useful for
end-users) methods public, and some public methods private (if we do that
for existing methods, and not as a rule for new methods). But what is the
main goal you want to achieve here? That it is clearer for EA implementors
what they need to implement? (Currently we use AbstractMethodError for that,
which already seems clear to me, and we have base tests that you can
inherit that should cover the basic things you need to implement.)
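
A minimal sketch of how those two signals work together, in plain Python.
The AbstractMethodError below is a local stand-in defined for the example
rather than the pandas class, and BaseArray, BaseConstructorTests, MyArray
and TestMyArray are hypothetical names, so treat this as an illustration
of the pattern and not as the actual pandas code.

```python
class AbstractMethodError(NotImplementedError):
    """Local stand-in for illustration; not imported from pandas."""


class BaseArray:
    """Hypothetical base class: required methods raise until overridden."""

    @classmethod
    def _from_sequence(cls, scalars):
        raise AbstractMethodError(f"{cls.__name__} must implement _from_sequence")

    def __len__(self):
        raise AbstractMethodError(f"{type(self).__name__} must implement __len__")


class BaseConstructorTests:
    """Inheritable checks; a subclass points array_type at its own class."""

    array_type = None

    def test_from_sequence_length(self):
        arr = self.array_type._from_sequence([1, 2, 3])
        assert len(arr) == 3


class MyArray(BaseArray):
    """Hypothetical third-party array implementing the required methods."""

    def __init__(self, values):
        self._data = list(values)

    @classmethod
    def _from_sequence(cls, scalars):
        return cls(scalars)

    def __len__(self):
        return len(self._data)


class TestMyArray(BaseConstructorTests):
    # Inheriting the base tests is all an implementor has to write here.
    array_type = MyArray


TestMyArray().test_from_sequence_length()  # passes once both methods exist
```

If MyArray forgot to define _from_sequence, the inherited test would fail
with the AbstractMethodError message naming the missing method.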

Joris


> Thoughts?