does lack of type declarations make Python unsafe?

David Abrahams dave at boost-consulting.com
Wed Jun 18 22:55:03 EDT 2003


Alex Martelli <aleax at aleax.it> writes:

> David Abrahams wrote:
>
>> Alex Martelli <aleax at aleax.it> writes:
>> 
>>> But this has little to do with the need of 'type declarations'.  I
>>> suspect that a statically typed language would also be better off
>>> without them, relying on type inferencing instead, a la Haskell (and
>>> Haskell's typeclasses to keep the inferencing as wide as feasible),
>>> for example.  But I have no research to back this up;-).
>> 
>> I don't have any first-hand experience, but the experience of friends
>> of mine who have used Haskell is that it can be exceedingly difficult
>> to locate the source of a type error when it does occur, since the
>> inference engine may propagate the "wrong" type back much further than
>> the source of the error.
>
> Surely the compiler should easily be able to annotate the sources with
> the information it has inferred, including, in particular, type
> information.

Oh, yes, IIUC it does: what you get is a long, nasty message that
resembles nothing so much as a C++ template instantiation backtrace.
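
A toy sketch of the kind of thing my friends describe (the exact
wording varies by compiler version):

    process xs = map (* 2) xs        -- inferred: Num b => [b] -> [b]

    main :: IO ()
    main = print (process "hello")   -- rejected here with something
                                     -- like "No instance for (Num
                                     -- Char)", rather than "you
                                     -- passed a String"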

> Thus it cannot possibly be any harder to identify the error point than
> if the same type declarations had been laboriously, redundantly written
> out by hand -- except, at worst, for a slight omission in the tool of a
> feature which would be easily provided.

No, if you "laboriously, redundantly" write out the type declarations
by hand you get precise feedback about the location of a type
mismatch.
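
To continue the toy sketch: write out the signature you intended, and
the compiler points straight at the bad call:

    process :: [Int] -> [Int]
    process xs = map (* 2) xs

    main :: IO ()
    main = print (process "hello")   -- now the report is precise:
                                     -- expected [Int], got String,
                                     -- at exactly this call site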

>> Furthermore, if you do everything by inference you lose the
>> explanatory power of type declarations.
>
> I think I know where you're coming from, having quite a past as a
> static-type-checking enthusiast myself, but I think you overrate the
> "explanatory power of type declarations".
>
> What I want to be able to do in my sources is assert a set of facts
> about "and at this point I know X holds".  

Much of what's in X is captured by type information.

> Sometimes X might perhaps be of the form "a is of type B", but
> that's really a very rare and specific case.  Much more often it
> will be "container c is non-empty", "sequence d is sorted", "either
> x<y or pred(a[z]) for some x>=z>=y", and so on, and so forth.  

Those assertions are *chock full* of type information:

      the type of c is a container

      the type of d is a sequence

      the type of a is indexable

      the types of x and y have a particular ordering relationship

and so on and so forth.

Type declarations don't have to identify concrete types; they can
identify concepts, constraints, and relationships.
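
In Haskell terms, for instance, a declaration can state nothing but
the concepts and relationships involved (a sketch using the standard
Ord class):

    -- no concrete type is named anywhere; the signature says only
    -- "the elements support ordering, and inputs and output agree"
    merge :: Ord a => [a] -> [a] -> [a]
    merge xs []         = xs
    merge [] ys         = ys
    merge (x:xs) (y:ys)
        | x <= y    = x : merge xs (y:ys)
        | otherwise = y : merge (x:xs) ys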

> Type declarations would have extraordinary "explanatory power" if
> and only if "a is of type B" was extraordinarily more important than
> the other kinds of assertions, and it just isn't -- even though, by
> squinting just right, you may end up seeing it that way by a sort of
> "Stockholm Syndrome" applied to the constraints your compiler forces
> upon you.

Suggesting that I've grown to love my shackles is a little bit
insulting.  I have done significant programming in Python where I
didn't have static typing; I've gotten over my initial reactions to
the lack of static checks and grown comfortable with the language.
Purely dynamic typing works fine for a while.  I have seen real
problems develop in my code that static type checking would have
prevented.

It's not an illusion that static types help (a lot) with certain
things.  The type information is at least half of what you've written
in each of those assertions.  I use runtime assertions, too, though
often I use type invariants to constrain the state of things --
because it makes reasoning about my code *much* easier.  An
ultra-simple case: it's great to be able to use an unsigned type and
not have to think about asserting x >= 0 everywhere.
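
Sketching the same idea in Haskell, with Numeric.Natural standing in
for the unsigned type:

    import Numeric.Natural

    -- every Natural is >= 0 by construction, so no assertion about
    -- the sign of the argument is ever needed
    spaces :: Natural -> String
    spaces n = replicate (fromIntegral n) ' '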

> Types are about implementation

No, they're about a relationship between an interface and semantics.
Most people leave out the semantics part when thinking about them,
though.

> and one should "program to an interface, not to an implementation"
> -- therefore, "a is of type B" is rarely what one SHOULD be focusing
> on.  

In a modern type system, "a is of type B" mostly expresses an
interface for a and says nothing about implementation per se.  It does
say something about the effects of using that interface, though that
part is harder to formalize.
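
Even a maximally generic declaration says something about effects.  A
Haskell sketch:

    -- whatever f does, it can only drop, duplicate, or rearrange its
    -- input elements; knowing nothing about the type a, it has no
    -- way to manufacture new ones
    f :: [a] -> [a]
    f = reverse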

However, with a nod to "practicality beats purity:"

     People don't usually think in these abstract terms about most of
     their code, and rigorously documenting code in terms of interface
     requirements is really difficult, so most people never do it.
     It's a poor investment anyway because *most* (not all) Python
     code is never used generically: common interfaces for polymorphic
     behavior are generally captured in base classes and a great deal
     of code is just operating on concrete types anyway.

     The result is that there is usually no expression of interface
     requirements at all in a function's interface/docs, even though
     in the *vast* majority of cases a simple (non-generic) type
     declaration would've done the trick.  [Without an expression of
     interface requirements, the possibility of using the function
     generically is lost, for all intents and purposes.]

So, while I buy "program to an interface" in theory, in practice it
is only appropriate in a small fraction of code.

> Of course, some languages blur the important distinction
> between a type and a typeclass (or, a class and an interface, in
> Java terms -- C++ just doesn't distinguish them by different
> concepts

Nor does most of the type theory I've seen.

> so, if you think in C++, _seeing_ the crucial distinction may be
> hard;-).

I know what typeclasses and variants are all about.

> "e provides such-and-such an interface" IS more often interesting, but,
> except in Eiffel, the language-supplied concept of "interface" is too
> weak for the interest to be sustained -- it's little more than the sheer
> "signature" that you can generally infer easily.  E.g.:
>
> my procedure receiving argument x
>
>     assert "x satisfies an interface that provides a method Foo which
>             is callable without arguments"
>
>     x.Foo()
>
> the ``assert'' (which might just as well be spelled "x satisfies
> interface Fooable", or, in languages unable to distinguish "being
> of a type" from "satisfying an interface", "x points to a Fooable")
> is ridiculously redundant, the worse sort of boilerplate.  

Only if you think that only syntax (and not semantics) counts.  It's
not just important that you can "Foo()" x, but that Fooing it means
what you think it does.

> Many, _many_ type declarations are just like that, particularly if
> one follows a nice programming style of many short
> functions/methods.  At least in C++ you may often express the
> equivalent of
>
> "just call x.Foo()!"
>
> as
>
> template <typename T>
> void myprocedure(T& x)
> {
>     x.Foo();
> }

In most cases it's evil to do this without a rigorous concept
definition (type constraint) for T in the documentation.  Pretty much
all principled template code (other than special cases like the lambda
library which are really just for forwarding syntax) does this, and
it's generally acknowledged as a weakness in C++ that there's no way
to express the type constraints in code.
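
This is exactly where the typeclasses Alex mentioned shine: the
constraint on T goes into the code, where it gets checked.  A minimal
sketch, with a hypothetical Fooable class mirroring the example
above:

    class Fooable a where
        foo :: a -> IO ()

    -- "x is of some type that can be Foo'd" is now part of the
    -- checked signature, not a remark buried in the documentation
    myprocedure :: Fooable a => a -> IO ()
    myprocedure x = foo x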

> where you're basically having to spend a substantial amount of
> "semantics-free boilerplate" to tell the compiler and the reader
> "x is of some type T" 

Where are you claiming the expression of the type of x is in the code
above?  I don't see it.

> (surprise, surprise; I'm sure this has huge explanatory power,
> doesn't it -- otherwise the assumption would have been that x was of
> no type at all...?)  

This kind of sneering only makes me doubt the strength of your
argument even more.  I know you're a smart guy; I ask you to treat my
position with the same respect with which I treat yours.

> while letting them both shrewdly infer that type T, whatever it
> might be, had better provide a method Foo that is callable without
> arguments (e.g. just the same as the Python case, natch).

Only if you consider the implementation of myprocedure to be its
documentation.

> You do get the "error diagnostics 2 seconds earlier" (while compiling
> in preparation to running unit-tests, rather than while actually
> running the unit-tests) if and when you somewhere erroneously call
> myprocedure with an argument that *doesn't* provide the method Foo
> with the required signature.  But, how can it surprise you if Robert
> Martin claims (and you've quoted me quoting him as if I was the
> original source of the assertion, in earlier posts) 

Hey, sorry, I just let Gnus do its job.  If the quote attributions
were messed up then someone messed them up before me.

> that this just isn't an issue...?  

It doesn't surprise me in the least that some people in the Python
community claim that their way is unambiguously superior.  It's been
going on for years.  I wanted to believe that, too.  My experience
contradicts that idea, unfortunately.

> If the compilation takes 3 seconds, then getting the error
> diagnostics 2 seconds earlier is still a loss of time, not a gain,
> compared to just running the tests w/o any compilation;-)...

Comprehensive test suites can't always run in a few seconds (the same
applies to compilations, but I digress).  In a lot of the work I've
done, testing takes substantially longer, unavoidably.  A great deal
of this work is exactly the sort of thing I like to use Python for, in
fact (but not because of the lack of type declarations).  If
compilation is reasonably fast and I have been reasonably
conscientious about my type invariants, though, I *can* detect many
errors with a static type system.

But more importantly, I can come back to my code months later and
still figure out what's going on, or work with someone else's code
without losing my way.  Isn't that why we're all using Python instead
of Perl?

> I do, at some level, want a language where I CAN (*not* MUST) make
> assertions about what I know to be true at certain points:
> 1. to help the reader in a way that won't go out of date (the assert
>    statement does that pretty well in most cases)
> 2. to get the compiler to do extra checks & debugging for me (ditto)
> 3. to let the compiler in optimizing mode deduce/infer whatever it
>    wants from the assertions and optimize accordingly (and assert is
>    no use here, at least as currently present in C, C++, Python)

Those are all the same things I want, and for the same reasons.  What
are we arguing about again?
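
For what it's worth, Control.Exception.assert in Haskell covers 1 and
2 pretty directly (a minimal sketch; GHC drops the checks under
-fignore-asserts):

    import Control.Exception (assert)

    pop :: [a] -> (a, [a])
    pop xs = assert (not (null xs))    -- states *and* checks the
             (head xs, tail xs)        -- precondition in one place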

> But even if and when I get such a language I strongly doubt most of
> my assertions will be similar to "type declarations" anyway...

Oh, there it is.  Well, if the language has a weak notion of type,
then you're probably right.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com