Proposal: Syntax for attribute initialisation in __init__ methods

Sam Ezeh sam.z.ezeh at gmail.com
Fri Apr 15 07:19:21 EDT 2022


Elsewhere, the idea of supporting new syntax to automatically initialise
attributes provided as arguments to __init__ methods was raised.

Very often, __init__ functions will take arguments only to assign them as
attributes of self. This proposal would remove the need to additionally
write `self.argument = argument` when doing this.

I'm specifically looking at statements of the form `self.argument =
argument`.

I ran a query on the source code of the top 20 most downloaded PyPI
packages (source:
https://github.com/dignissimus/presearch/blob/master/queries/attribute_initialisation.py)
and found the following statistics. 19% of classes that define __init__
with at least one non-self argument assign all of those non-self arguments
as attributes with the same name within that method.
I also found that 33% of __init__ functions that have at least one
non-self argument assign at least one of their arguments as an attribute
with the same name, that 27% of __init__ functions that have at least two
non-self arguments assign at least two of them as attributes with the same
name, and that 28% of __init__ functions that have three or more non-self
arguments assign at least three of them as attributes with the same name.

```
[sam@samtop]: ~/Documents/git/presearch>$ presearch -f queries/attribute_initialisation.py sources/
Running queries...
Out of 1526 classes defining __init__, there were 290 (19.0%) classes whose __init__ functions assigned all non-self arguments as attributes.
Out of 1526 __init__ functions with at least one non-self argument, there were 497 (32.57%) __init__ functions that assigned one or more non-self arguments as attributes.
Out of 834 __init__ functions with at least two non-self arguments, there were 221 (26.5%) __init__ functions that assigned two or more non-self arguments as attributes.
Out of 490 __init__ functions with at least three non-self arguments, there were 139 (28.37%) __init__ functions that assigned three or more non-self arguments as attributes.
[sam@samtop]: ~/Documents/git/presearch>$
```
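For anyone who wants to try a similar count without presearch, here is a
rough sketch using only the standard library's ast module. It is not the
query linked above and will not necessarily reproduce the same numbers; the
names SOURCE, same_name_assignments and scan are made up for illustration,
and it only recognises plain `self.name = name` assignments.

```python
import ast

# Sample input; ast.parse never runs the code, so do_something need not exist.
SOURCE = '''
class ExampleClass:
    def __init__(self, example, argument, word):
        self.example = example
        self.argument = argument
        do_something(word)
'''


def same_name_assignments(init):
    """Return (non-self parameter names, those assigned as self.name = name)."""
    params = [a.arg for a in init.args.posonlyargs + init.args.args]
    if not params:
        return [], set()
    self_name = params[0]
    others = params[1:] + [a.arg for a in init.args.kwonlyargs]
    found = set()
    for node in ast.walk(init):
        # Only simple assignments whose right-hand side is a bare name.
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Name):
            for target in node.targets:
                if (isinstance(target, ast.Attribute)
                        and isinstance(target.value, ast.Name)
                        and target.value.id == self_name
                        and target.attr == node.value.id
                        and target.attr in others):
                    found.add(target.attr)
    return others, found


def scan(source):
    """Yield (class name, non-self params, matched params) per __init__."""
    for cls in ast.walk(ast.parse(source)):
        if isinstance(cls, ast.ClassDef):
            for item in cls.body:
                if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                    yield (cls.name, *same_name_assignments(item))


results = list(scan(SOURCE))
```

Each entry in results pairs a class with its non-self parameters and the
subset that is assigned under the same name, which is enough to compute
"all", "at least one", "at least two" style percentages.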

With the new syntax, the following snippet taken from the pyyaml source
code (pyyaml is the 12th most downloaded package this month on PyPI)

```
def __init__(self, default_style=None, default_flow_style=False,
             sort_keys=True):
    self.default_style = default_style
    self.sort_keys = sort_keys
    self.default_flow_style = default_flow_style
    self.represented_objects = {}
    self.object_keeper = []
    self.alias_key = None
```

can be rewritten as follows:

```
def __init__(self, @default_style=None, @default_flow_style=False,
             @sort_keys=True):
    self.represented_objects = {}
    self.object_keeper = []
    self.alias_key = None
```

And from numpy, the following code

```
def __init__(self, mbfunc, fillx=0, filly=0):
    """
    abfunc(fillx, filly) must be defined.

    abfunc(x, filly) = x for all x to enable reduce.

    """
    super().__init__(mbfunc)
    self.fillx = fillx
    self.filly = filly
    ufunc_domain[mbfunc] = None
    ufunc_fills[mbfunc] = (fillx, filly)
```

can be written like this:

```
def __init__(self, mbfunc, @fillx=0, @filly=0):
    """
    abfunc(fillx, filly) must be defined.

    abfunc(x, filly) = x for all x to enable reduce.

    """
    super().__init__(mbfunc)
    ufunc_domain[mbfunc] = None
    ufunc_fills[mbfunc] = (fillx, filly)
```

Some related approaches are attrs, dataclasses and the use of a
decorator. There's potentially a point to be raised that the results
from the first query indicate that the @dataclass decorator is not being
used enough. One advantage this proposal offers is control over the
arguments that the __init__ function takes.

A downside to using a decorator is that it might become difficult to accept
arguments that don't need to be assigned to anything.

I gave the following example (unlike the snippets above, it is not taken
from existing Python source code). Here, a decorator can't assign all of
the arguments to attributes, or else it would produce code that does
something different.

```
class ExampleClass:
    def __init__(self, example, argument, word):
        self.example = example
        self.argument = argument
        do_something(word)
```

In response, the following was given

```
class ExampleClass:
    @make_attr("example")
    @make_attr("argument")
    def __init__(self, example, argument, word):
        do_something(word)
```

And this could potentially be written like this

```
class ExampleClass:
    @make_attr("example", "argument")
    def __init__(self, example, argument, word):
        do_something(word)
```

However, having to rewrite the argument name might defeat the purpose of
having a decorator.
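
For reference, a decorator along these lines can be written today. The
sketch below is only one possible implementation: make_attr is not an
existing library function, and the `seen` list is a stand-in I've used in
place of do_something so the example runs on its own.

```python
import functools
import inspect


def make_attr(*names):
    """Hypothetical decorator: copy the named arguments onto self, then run __init__."""
    def decorator(init):
        # inspect.signature follows __wrapped__, so stacked uses still see
        # the original __init__ parameters.
        signature = inspect.signature(init)
        unknown = set(names) - set(signature.parameters)
        if unknown:
            raise TypeError(f"unknown parameter(s): {sorted(unknown)}")

        @functools.wraps(init)
        def wrapper(self, *args, **kwargs):
            bound = signature.bind(self, *args, **kwargs)
            bound.apply_defaults()
            for name in names:
                setattr(self, name, bound.arguments[name])
            return init(self, *args, **kwargs)

        return wrapper
    return decorator


seen = []  # stand-in for whatever do_something(word) would do


class ExampleClass:
    @make_attr("example", "argument")
    def __init__(self, example, argument, word):
        seen.append(word)


obj = ExampleClass(1, argument=2, word="w")
```

Here `example` and `argument` end up as attributes while `word` is only
passed through to the body. Note that this version assigns the attributes
before the body runs, which is a choice the decorator has to make one way
or the other.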

As an idea, I thought about the case where an __init__ method takes only
arguments that will become attributes. From the cryptography library,
there's the following example.

```
class _DeprecatedValue:
    def __init__(self, value: object, message: str, warning_class):
        self.value = value
        self.message = message
        self.warning_class = warning_class
```

With the new syntax, this would become the following

```
class _DeprecatedValue:
    def __init__(self, @value: object, @message: str, @warning_class):
        pass
```

The empty __init__ method seems unnecessary, so perhaps it could be reduced
further to the following

```
class _DeprecatedValue:
    @value: object
    @message: str
    @warning_class
```
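
For comparison, this is roughly what the existing dataclasses module
offers for the same two cases. It's a sketch rather than a drop-in
replacement: dataclasses require an annotation on every field, so the
`warning_class: type` and `object` annotations are my additions, and
do_something and `calls` are stand-ins so the example runs.

```python
from dataclasses import InitVar, dataclass

calls = []


def do_something(word):
    # Stand-in for the real work done with the non-attribute argument.
    calls.append(word)


@dataclass
class _DeprecatedValue:
    value: object
    message: str
    warning_class: type  # annotation added; the original leaves it unannotated


@dataclass
class ExampleClass:
    example: object
    argument: object
    word: InitVar[str]  # accepted by __init__ but not stored on the instance

    def __post_init__(self, word):
        do_something(word)


deprecated = _DeprecatedValue(1, "use something else", DeprecationWarning)
example = ExampleClass("a", "b", "c")
```

The InitVar field covers the "argument that isn't an attribute" case, at
the cost of moving the body into __post_init__ and letting the decorator
generate __init__ (along with __repr__ and __eq__ by default).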

With regards to implementation details, there are questions about the order
of execution. Right now, it seems to make sense to me that attributes should
be assigned so that they are not overridden by any calls to super(), for
code similar to that in the numpy example. Another question is what happens
if the syntax is used outside of an __init__ method. Right now, it seems to
me that this could be OK and the code could run the same as it would inside
the __init__ method. There is also the question of what happens if the
syntax is used inside a function that isn't defined in a class; perhaps
this usage should be rejected. As another edge case, there's the question
of what would happen if this syntax is used on functions decorated with
@staticmethod. Type hinting will also have to be dealt with.
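
To make the order-of-execution question concrete, here are the two obvious
ways `def __init__(self, @fillx=0)` could be expanded, written out in
today's syntax. This is only an illustration: Base is a made-up parent
class that happens to set the same attribute, and the proposal itself does
not yet say which expansion would be used.

```python
class Base:
    def __init__(self):
        self.fillx = "set by Base"


class AssignBeforeBody(Base):
    # Expansion 1: the attribute is assigned before the body runs.
    def __init__(self, fillx=0):
        self.fillx = fillx
        super().__init__()  # body: overwrites self.fillx


class AssignAfterBody(Base):
    # Expansion 2: the attribute is assigned after the body has run.
    def __init__(self, fillx=0):
        super().__init__()  # body
        self.fillx = fillx


before = AssignBeforeBody(5).fillx
after = AssignAfterBody(5).fillx
```

With the first expansion the parent's value wins; with the second, the
argument does. On the other hand, assigning after the body means the body
itself can't rely on the attribute already being set.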

With all of this in mind, I'd like to hear what other people think about
this proposal.

To perform the queries I created a tool called presearch, which others might
find useful. I found it quite fun to make; however, it has only been in
existence for 2 days, so there currently isn't any documentation and it's
lacking in several areas.
The source code can be found here: https://github.com/dignissimus/presearch

Kind Regards,
Sam Ezeh


More information about the Python-list mailing list