does lack of type declarations make Python unsafe?

Tue Jun 17 21:10:02 EDT 2003

Duncan Booth <duncan at NOSPAMrcp.co.uk> writes:

> beliavsky at aol.com wrote in 
> news:3064b51d.0306170436.2b31f9e at posting.google.com:
>
>> subroutine solve_lsq_svd_rmse(aa,bb,xx,rmse,ierr)
>> ! solve a set of least-squares linear equations and compute the rmse
>> real     , intent(in)   :: aa(:,:) ! matrix of independent variables
>> real     , intent(in)   :: bb(:)   ! vector of dependent variable
>> real     , intent(out)  :: xx(:)   ! solution of least-squares problem
>> real     , intent(out)  :: rmse    ! rmse of regression
>> integer  , intent(out)  :: ierr    ! error flag
>> 
>> In Python, to ensure that aa is a 2-D array and that bb and xx are 1-D
>> arrays, you need to write some checking code, which will not be as
>> clear as the above declarations IMO. Even if you do this, the errors
>> will be caught at run time, not compile time.
>> 
>> (Regardless of the language, you need to check that the # of rows in
>> aa equals the # of elements in bb. This usually must be done at run
>> time.)
>
> You don't need to check that xx is a 1-D array since it is a result, so 
> your function just generates a 1-D array and returns it. Likewise you don't 
> need to check the type of rmse, and I really hope that in Python ierr would 
> disappear entirely in favour of an exception.
>
> So you are down to the question of whether aa is a 2-D array, and whether 
> bb is a 1-D array of the correct size. Rather than checking the type of aa, 
> you could just try using it as a 2-D array. You will be pretty hard pushed 
> to accidentally pass in something of the wrong type that produces a result 
> rather than an exception.
>
> I think you also need to distinguish not just between compile time and 
> runtime, but between compile time, test time and runtime. Compile and test 
> times happen (I hope) every few minutes while you are developing the 
> program. If you are passing an incompatible type into a function, this 
> should be caught at test time, i.e. within about 2 minutes maximum of you 
> writing the call to the function.

Hi Duncan,

Well, first of all, the kinds of things you need to test in numerical
linear algebra often take a lot longer than 2 minutes.  Systems meant
to solve large problems often need large tests in order to exercise
all the emergent behaviors.

Regardless, I fear you have conveniently missed the point here.  What
prefixed the text you quoted was:

> I find that static typing makes a big difference for two things:
> 
>   1. Readability.  It really helps to have names introduced with a
>      type or a type constraint which expresses what kind of thing they
>      are.  This is especially true when I am coming back to code after
>      a long time or reading someone else's work.  Attaching that
>      information to the name directly is odious, though, and leads to
>      abominations like Hungarian notation.

This was not about code correctness but readability/maintainability.
Yes, you can do the same kind of thing with comments, but:

    a. Comments go out-of-date, while static checks don't.

    b. Documenting generic type constraints is very difficult to do
       concisely in English, so people generally don't write them.  Go
       through any module from a large body of Python code and tell
       me what percentage of functions have rigorous comments
       describing their type requirements.
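
For concreteness, here is a minimal sketch of the run-time checking
code discussed in the quoted exchange. The names check_lsq_inputs and
rmse are hypothetical, plain nested lists stand in for a real array
type, and the annotations use current Python 3 syntax. Note that the
annotations only document intent; nothing verifies them before the
code runs, so the shape checks still have to be written by hand.

```python
from math import sqrt
from typing import Sequence


def check_lsq_inputs(aa: Sequence[Sequence[float]],
                     bb: Sequence[float]) -> None:
    """Raise an exception instead of setting a Fortran-style ierr flag."""
    # aa must be a non-empty sequence of rows, not a flat list of numbers.
    if not aa or any(isinstance(row, (int, float)) for row in aa):
        raise TypeError("aa must be a non-empty 2-D array (sequence of rows)")
    ncols = len(aa[0])
    if any(len(row) != ncols for row in aa):
        raise ValueError("all rows of aa must have the same length")
    # bb must be a flat sequence of numbers.
    if any(not isinstance(b, (int, float)) for b in bb):
        raise TypeError("bb must be a 1-D array of numbers")
    # This check is needed at run time in any language.
    if len(aa) != len(bb):
        raise ValueError("aa has %d rows but bb has %d elements"
                         % (len(aa), len(bb)))


def rmse(aa: Sequence[Sequence[float]], bb: Sequence[float],
         xx: Sequence[float]) -> float:
    """Root-mean-square error of the residuals aa*xx - bb."""
    check_lsq_inputs(aa, bb)
    if len(xx) != len(aa[0]):
        raise ValueError("xx must have one element per column of aa")
    residuals = [sum(a * x for a, x in zip(row, xx)) - b
                 for row, b in zip(aa, bb)]
    return sqrt(sum(r * r for r in residuals) / len(bb))
```

Passing a flat list as aa raises TypeError immediately, but only when
the call is actually executed.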

I have this experience all the time.  Just today I was trying to fix
some problems with ReStructuredText and PDF writing via ReportLab.  I
am not deeply familiar with either codebase.  I was having problems
with a function and had a devil of a time trying to figure out what
kinds of things were passed as its parameter named 'node'.  I added
some printing code (you shouldn't have to modify third-party source
just to analyze it), but polymorphism defeated that: the parameter
turned out to have all kinds of types.  Eventually I had to guess.
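
As a rough sketch of that kind of investigation, a plain function can
be wrapped from the outside so that the types it receives are tallied
without editing its source. Everything here is hypothetical: visit
stands in for the third-party function, and record_arg_types is an
illustrative helper, not part of either library. For a method, the
first argument would be self, so the wrapper would need adjusting.

```python
import functools
from collections import Counter


def record_arg_types(func, counts):
    """Wrap func so the type name of its first argument is tallied."""
    @functools.wraps(func)
    def wrapper(first, *args, **kwargs):
        counts[type(first).__name__] += 1
        return func(first, *args, **kwargs)
    return wrapper


# Hypothetical stand-in for a third-party function with a 'node' parameter.
def visit(node):
    return str(node)


seen = Counter()
visit = record_arg_types(visit, seen)  # rebind the name; source untouched

for node in ["text", 3, ["child"], "more text"]:
    visit(node)

print(seen.most_common())  # tally of the type names observed
```

A tally like this reports what was passed on one particular run; it
says nothing about what the function is meant to accept.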

I'm not saying that Python isn't a wonderful language -- it is.  It's
great to have the flexibility of fully dynamic typing available.  All
the same, I'm not going to pretend that static typing doesn't have
real advantages.  I seriously believe that it's better most of the
time, because it helps catch mistakes earlier, makes even simple code
easier to maintain, and makes complex code easier to write.  I would
trade some convenience for these other strengths.  Although it's very
popular around here to say that research has shown static typing
doesn't help, I've never seen that research, and my personal
experience contradicts the claim anyway.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com