[issue32513] dataclasses: make it easier to use user-supplied special methods

Eric V. Smith report at bugs.python.org
Sun Jan 21 21:41:21 EST 2018


Eric V. Smith <eric at trueblade.com> added the comment:

Here is my proposal for making it easier for the user to supply dunder methods that dataclasses would otherwise automatically add to the class.

For all of these cases, when I talk about init=, repr=, eq=, order=, hash=, or frozen=, I'm referring to the parameters to the dataclass decorator.

When I say a dunder method already exists, I mean there is an entry for it in the class's __dict__. I never check to see if a member is defined in a base class.
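
A minimal sketch of that check, for illustration only. The helper name defined_in_class is hypothetical and not part of the proposal; the point is that a lookup in the class's own __dict__ ignores inherited members, while hasattr() does not.

```python
def defined_in_class(cls, name):
    """Return True only if `name` is set directly in the class body."""
    return name in cls.__dict__


class Base:
    def __repr__(self):
        return "Base()"


class Child(Base):
    pass


# Base defines __repr__ itself; Child only inherits it.
print(defined_in_class(Base, "__repr__"))
print(defined_in_class(Child, "__repr__"))
# hasattr() also sees members of base classes, so it cannot make this distinction.
print(hasattr(Child, "__repr__"))
```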

Let's get the easy ones out of the way first.

__init__
If __init__ exists or init=False, don't generate __init__.

__repr__
If __repr__ exists or repr=False, don't generate __repr__.

__setattr__
__delattr__
If frozen=True and either of these methods exists, raise a TypeError. These methods are needed to implement the "frozen-ness" of the class, and if they're already set then that behavior can't be enforced.

__eq__
If __eq__ exists or eq=False, don't generate __eq__.

And now the harder ones:

__ne__
I propose to never generate a __ne__ method. Python will call __eq__ and negate the result for you. If you feel you really need a non-matching __ne__, then you can write it yourself without interference from dataclasses.
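
To illustrate why a generated __ne__ is unnecessary, here is a small sketch with a hypothetical Point class that defines only __eq__. In Python 3 the default __ne__ delegates to __eq__ and inverts the result (unless __eq__ returns NotImplemented).

```python
class Point:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x


a, b, c = Point(1), Point(1), Point(2)

# No __ne__ is defined on Point, yet != gives the negation of ==.
print(a == b, a != b)
print(a == c, a != c)
print("__ne__" in Point.__dict__)
```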

__lt__
__le__
__gt__
__ge__
I propose to treat each of these individually, with each one controlled by the value of the order= parameter. So:
If __lt__ exists or order=False, don't generate __lt__.
If __le__ exists or order=False, don't generate __le__.
If __gt__ exists or order=False, don't generate __gt__.
If __ge__ exists or order=False, don't generate __ge__.
If for some crazy reason you want to define some of these but not others, then set order=False and write your desired methods.
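
The four rules above can be sketched as a single helper. This only illustrates the rule as proposed in this message; the names ORDER_METHODS and order_methods_to_generate are hypothetical, and nothing here claims to match the behavior of any released dataclasses module.

```python
ORDER_METHODS = ("__lt__", "__le__", "__gt__", "__ge__")


def order_methods_to_generate(cls, order):
    """List the ordering dunders that would be generated under the proposal."""
    if not order:
        return []
    # Only the class's own __dict__ is consulted, never base classes.
    return [name for name in ORDER_METHODS if name not in cls.__dict__]


class Version:
    # A user-supplied __lt__; the other three methods are absent.
    def __lt__(self, other):
        return NotImplemented


print(order_methods_to_generate(Version, order=True))
print(order_methods_to_generate(Version, order=False))
```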

__hash__
Whether dataclasses might generate __hash__ depends on the values of hash=, eq=, and frozen=. Note that the default value of hash= is None. See below for an explanation.

If hash=False, never generate __hash__. If hash=True, generate __hash__ unless it already exists.

If hash=None (the default), then use this table to decide whether and how to generate __hash__:
eq=?    frozen=?    __hash__
False   False       do not generate __hash__
False   True        do not generate __hash__
True    False       set __hash__ to None unless it already exists
True    True        generate __hash__ unless it already exists
                    and is not None
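
The table can be sketched as a small decision function for the hash=None case. The function name default_hash_action, the sentinel _MISSING, and the returned strings are all hypothetical. The handling of the last row follows one reading of this message: an existing __hash__ that is None is still overwritten, while any other existing __hash__ is left alone.

```python
_MISSING = object()  # sentinel: no __hash__ entry in the class's own __dict__


def default_hash_action(cls, eq, frozen):
    """Decide what to do about __hash__ when hash=None (the default)."""
    if not eq:
        return "leave"  # rows 1 and 2: do not generate __hash__
    existing = cls.__dict__.get("__hash__", _MISSING)
    if not frozen:
        # row 3: set __hash__ to None unless it already exists
        return "set_none" if existing is _MISSING else "leave"
    # row 4: generate unless a non-None __hash__ already exists
    if existing is _MISSING or existing is None:
        return "generate"
    return "leave"


class Plain:
    pass


class WithEq:
    # Defining __eq__ makes Python set __hash__ = None in the class __dict__.
    def __eq__(self, other):
        return NotImplemented


class WithHash:
    def __hash__(self):
        return 0


print(default_hash_action(Plain, eq=True, frozen=False))
print(default_hash_action(Plain, eq=True, frozen=True))
print(default_hash_action(WithEq, eq=True, frozen=True))
print(default_hash_action(WithHash, eq=True, frozen=True))
```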

Note that it's almost always a bad idea to specify hash=True or hash=False. The action given by the above table (where hash=None, the default) is usually the correct behavior.

One special case to recognize is if the class defines a __eq__. In this case, Python will assign __hash__=None before the dataclass decorator is called. The decorator cannot distinguish between these two cases (except possibly by using the order of __dict__ keys, but that seems overly fragile):

from dataclasses import dataclass

@dataclass
class A:
    def __eq__(self, other): pass

@dataclass
class B:
    def __eq__(self, other): pass
    __hash__ = None

This is the source of the last line in the above table: for a dataclass where eq=True, frozen=True, and hash=None, if __hash__ is None it will still be overwritten. The assumption is that this is what the user wants, but it's a tricky corner case. It also occurs when hash=True is set and the class defines __eq__. Again, it's not expected to come up in normal usage.
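
The ambiguity can be seen without the decorator at all. In this sketch (plain classes, reusing the names A and B from above), Python itself adds __hash__ = None to the __dict__ of a class that defines __eq__ but not __hash__, so both classes end up with the same entry by the time a decorator would run.

```python
class A:
    def __eq__(self, other):
        return NotImplemented


class B:
    def __eq__(self, other):
        return NotImplemented
    __hash__ = None


# Both class dicts hold an explicit __hash__ entry whose value is None,
# whether it was set implicitly by Python (A) or written by the user (B).
print("__hash__" in A.__dict__, A.__dict__["__hash__"])
print("__hash__" in B.__dict__, B.__dict__["__hash__"])
```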

----------
keywords:  -patch

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32513>
_______________________________________

