[Python-ideas] Trial balloon: adding variable type declarations in support of PEP 484

Guido van Rossum guido at python.org
Wed Aug 3 12:11:42 EDT 2016


On Wed, Aug 3, 2016 at 8:24 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 3 August 2016 at 10:48, Alvaro Caceres via Python-ideas
> <python-ideas at python.org> wrote:
>> The criticism I would make about allowing variables without assignments like
>>
>>   a: float
>>
>> is that it makes my mental model of a variable a little bit more complicated
>> than it is currently. If I see "a" again a few lines below, it can either be
>> pointing to some object or be un-initialized. Maybe the benefits are worth
>> it, but I don't really see it, and I wanted to point out this "cost".
>
> This concern rings true for me as well - "I'm going to be defining a
> variable named 'a' later and it will be a float" isn't a concept
> Python has had before. I *have* that concept in my mental model of
> C/C++, but trying to activate for Python has my brain going "Wut?
> No.".

Have you annotated a large code base yet? This half of the proposal
comes from over six months of experience annotating large amounts of
code (Dropbox code and mypy itself). We commonly see situations where
a variable is assigned on each branch of an if/elif/etc. structure. If
you need to annotate that variable, mypy currently requires that you
put the annotation on the first assignment to the variable, which is
in the first branch. It would be much cleaner if you could declare the
variable before the first `if`. But picking a good initializer is
tricky, especially if you have a type that does not include None.

As an illustration, I found this code:
https://github.com/python/mypy/blob/master/mypy/checkexpr.py#L1152

        if op == 'not':
            self.check_usable_type(operand_type, e)
            result = self.chk.bool_type()  # type: Type
        elif op == '-':
            method_type = self.analyze_external_member_access('__neg__',
                                                              operand_type, e)
            result, method_type = self.check_call(method_type, [], [], e)
            e.method_type = method_type
        elif op == '+':
            method_type = self.analyze_external_member_access('__pos__',
                                                              operand_type, e)
            result, method_type = self.check_call(method_type, [], [], e)
            e.method_type = method_type
        else:
            assert op == '~', "unhandled unary operator"
            method_type = self.analyze_external_member_access('__invert__',
                                                              operand_type, e)
            result, method_type = self.check_call(method_type, [], [], e)
            e.method_type = method_type
        return result

Look at the annotation of `result` in the if-block. We need an
annotation because the first function used to assign it returns a
subclass of `Type`, and the type inference engine will assume the
variable's type is that of the first assignment. Be that as it may,
given that we need the annotation, I think the code would be clearer
if we could set the type *before* the `if` block. But we really don't
want to set a value, and in particular we don't want to set it to
None, since (assuming strict None-checking) None is not a valid value
for this type -- we don't want the type to be `Optional[Type]`. IOW I
want to be able to write this code as

        result: Type
        if op == 'not':
            self.check_usable_type(operand_type, e)
            result = self.chk.bool_type()
        elif op == '-':
        # etc.
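
A minimal self-contained sketch of the same pattern, for anyone who wants
to try it. The names `Shape`, `Circle`, `Square` and `make_shape` are
hypothetical stand-ins, not mypy code, and the bare declaration uses the
syntax proposed in this thread (it runs on Python 3.6 and later, where
that syntax was adopted):

```python
class Shape:
    """Hypothetical base class, standing in for mypy's `Type`."""


class Circle(Shape):
    pass


class Square(Shape):
    pass


def make_shape(kind: str) -> Shape:
    # Declare the type once, before the branches, without binding a value.
    # No placeholder such as None is needed, so the declared type does not
    # have to become Optional[Shape].
    shape: Shape
    if kind == 'circle':
        # Without the declaration, a checker that infers from the first
        # assignment would take the narrower type Circle here.
        shape = Circle()
    elif kind == 'square':
        shape = Square()
    else:
        raise ValueError("unhandled kind: " + kind)
    return shape
```

The declaration binds nothing at runtime; it only tells the type checker
what every later assignment to `shape` must be compatible with.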

> I'd be much happier if we made initialisation mandatory, so the above
> would need to be written as either:
>
>     a: float = 0.0 # Or other suitable default value
>
> or:
>
>     a: Optional[float] = None
>
> The nebulous concept (and runtime loophole) where you can see:
>
>     class Example:
>         a: float
>         ...
>
> but still have Example().a throw AttributeError would also be gone.

That's an entirely different issue though -- PEP 484 doesn't concern
itself with whether variables are always initialized (though it's
often easy for a type checker to check that). If we wrote that using
`__init__` we could still have such a bug:

class Example:
    def __init__(self, n: int) -> None:
        for i in range(n):
            self.a = 0.0  # type: float

But the syntax used for declaring types is not implicated in this bug.
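
To make that concrete, here is the same class exercised both ways. This is
only an illustration of the runtime behavior; the driver code around the
class is mine, not part of any proposal:

```python
class Example:
    def __init__(self, n: int) -> None:
        for i in range(n):
            self.a = 0.0  # type: float


# With n >= 1 the loop body runs and the attribute is set.
initialized = Example(1)
print(initialized.a)

# With n == 0 the loop body never runs, so `a` is never assigned,
# even though the annotation sits on an actual assignment.
try:
    Example(0).a
except AttributeError as exc:
    print("AttributeError:", exc)
```

So an annotation attached to an assignment gives no more of a guarantee
that the attribute exists than a bare declaration would.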

> (Presumably this approach would also simplify typechecking inside
> __new__ and __init__ implementations, as the attribute will reliably
> be defined the moment the instance is created, even if it hasn't been
> set to an appropriate value yet)

But, again, real problems arise when the type of an *initialized*
instance variable must always be some data structure (and not None),
but you can't come up with a reasonable default initializer that has
the proper type.

Regarding the question of whether it's better to declare the types of
instance variables in `__init__` (or `__new__`) or at the class level:
for historical reasons, mypy uses both idioms in different places, and
when exploring the code I've found it much more helpful to see the
types declared in the class rather than in `__init__`. Compare for
yourself:

https://github.com/python/mypy/blob/master/mypy/build.py#L84 (puts the
types in `__init__`)

https://github.com/python/mypy/blob/master/mypy/build.py#L976 (puts
the types in the class)
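
For readers who don't want to click through, here is a rough sketch of the
two idioms side by side. The class and attribute names are hypothetical
(they are not the ones in build.py), and the class-level version is
written with the syntax proposed in this thread:

```python
from typing import Dict, List


class StateInInit:
    """Types declared in `__init__`, next to the first assignments."""

    def __init__(self, path: str) -> None:
        self.path = path  # type: str
        self.dependencies = []  # type: List[str]
        self.priorities = {}  # type: Dict[str, int]


class StateInClass:
    """Types declared once in the class body, ahead of any code."""

    path: str
    dependencies: List[str]
    priorities: Dict[str, int]

    def __init__(self, path: str) -> None:
        self.path = path
        self.dependencies = []
        self.priorities = {}
```

In the second form the attributes and their types can be read off the top
of the class without scanning `__init__`, and the bare declarations do not
create class attributes.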

-- 
--Guido van Rossum (python.org/~guido)

