Determining types of variables and function params by parsing the source code

Stefan Schwarzer sschwarzer at sschwarzer.net
Fri Aug 23 16:50:08 EDT 2002


Hi Markus

Markus Meyer wrote:
> [Motivation for determining types]
> 
> Variables (objects) are instantiated in Python by using one of the
> following mechanisms (am I missing something?):
> 1) Explicit instantiation of basic types (numbers, strings, ...)

Do you mean simply writing a literal, e.g. 1?

> 2) Construction by the object constructor

As in ...

class X: pass
X()  # new object

?

> 3) Assignment of another variable
> 4) Assignment of the return value of a function
> 5) Assignment of the result of an operator action (e.g. list
> operators)

Perhaps I don't understand you correctly, but assignments per se don't create
new objects; they bind new references to existing objects.
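
A small sketch of what I mean (written with the print(...) call form; the
variable names are made up for illustration):

```python
# Assignment binds a second name to the same object; no copy is made.
a = [1, 2, 3]
b = a
same_object = b is a          # both names refer to one list
b.append(4)                   # mutating through b is visible through a

# A constructor call, by contrast, builds a new object.
c = list(a)
is_new_object = c is not a

print(same_object, a, is_new_object)
```

So for type tracing, "b = a" tells you only that b has whatever type a has at
that moment; the object itself was created somewhere else.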

There is also the "new" module, though I don't know what it does behind the
scenes. There are other kinds of factories in Python as well, e.g. the match
function in the "re" module, which returns a match object.
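
As a rough illustration of such a factory (the pattern and the input strings
are arbitrary examples of mine), note that re.match returns either a match
object or None, so even here the result type depends on the data:

```python
import re

# On success re.match hands back a match object ...
m = re.match(r"(\w+)=(\w+)", "key=value")
# ... and on failure it returns None instead.
no_match = re.match(r"\d+", "hello")

print(type(m).__name__)        # class name of the match object
print(m.group(1), m.group(2))
print(no_match is None)
```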

> What I intend to do is to "trace forward" these assignments to obtain
> as much type information as possible. Imagine the following code:
> 
> "this is hello.py"
> def print13(s):
>    t = s[1:3]
>    print t
> 
> s = 'hello'
> print13(s)
> 
> The following information could be extracted by a sophisticated parser:
> - print13 takes an argument of type string (because it is called with
> argument s, which is a string because of the previous assignment)

On the other hand, nothing prevents this:

def print13(s):
    t = s[1:3]
    print t

s = 'hello'
print13(s)

print13( [1, 2, 3, 4] )

In the second invocation, s in print13 is a list.

> - print takes an argument of type string (because it is called with
> argument t, which was constructed by using the [] operator on s, which
> is an argument of the function print13, which takes an argument of
> type string)

print13 _takes_ arguments of _any_ type. Some of them will cause an exception
to be raised, some not.
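
This is easy to try out. The sketch below is a variant of print13 that also
returns t (my addition, only so the result can be inspected) and uses the
print(...) call form; the "results" dictionary is just for illustration:

```python
def print13(s):
    t = s[1:3]
    print(t)
    return t  # not in the original; returned only for inspection

results = {}
for arg in ['hello', [1, 2, 3, 4], (1, 2, 3, 4), 42]:
    try:
        results[type(arg).__name__] = print13(arg)
    except TypeError:
        # an int cannot be sliced, so this call fails at runtime
        results[type(arg).__name__] = 'TypeError'
```

Strings, lists and tuples all work, because all that print13 needs is slicing;
only the int is rejected, and only when the slice is actually executed.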

> Of course I'm aware that there is inheritance, and some functions can
> take arguments of more than one type. One would have to take this into
> account.

How? :-)

> Further extensions would be possible. For example, if you have some context
> where var.myspecialfunc() is called, and MySpecialClass is the only
> class declaring a public function with name myspecialfunc(), you can
> safely assume, that var is of type MySpecialClass.
> 
> So, is this total crap? Has anyone else tried this? If not, would you
> recommend writing (1) another parser for Python, (2) use the parser
> module, (3) do something completely different?
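
To make the idea concrete, here is a very small sketch of such forward tracing.
It assumes the standard "ast" module of a recent Python 3 (which is not one of
the options you list), and infer_literal_types is a made-up helper name. It
only looks at module-level assignments of literals and of already-seen names:

```python
import ast

def infer_literal_types(source):
    """Map module-level names to a guessed type name (hypothetical helper)."""
    guesses = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        if isinstance(value, ast.Constant):      # literal such as 'hello' or 1
            guess = type(value.value).__name__
        elif isinstance(value, ast.List):        # list display such as [1, 2]
            guess = 'list'
        elif isinstance(value, ast.Name):        # assignment of another variable
            guess = guesses.get(value.id, 'unknown')
        else:                                    # calls, operators, ...: give up
            guess = 'unknown'
        for target in node.targets:
            if isinstance(target, ast.Name):
                guesses[target.id] = guess
    return guesses

source = "s = 'hello'\nt = s\nn = 1\nxs = [1, 2]\nu = f()\n"
guesses = infer_literal_types(source)
print(guesses)
```

Note how quickly it has to answer 'unknown': the return value of f() is already
out of reach without analysing f itself.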

I think you can get some type information, but it will be more of a good guess
than a safe bet. Also consider dynamic source/code generation via exec, eval and
execfile. You can't safely determine the type by static analysis, even without
exec etc., e.g.

-----
# Module a.py

_type = int

def return_type():
    return _type()

print return_type()  # gives 0 (int)
-----

Analyzing module "a" might make you think that return_type returns an int ...

-----
# Module b.py

import a

print a.return_type()  # gives 0 (int)
a._type = str
print a.return_type()  # gives '' (str)
-----

... but it can return any type.
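
The same effect can be reproduced in a single file. In this sketch the module
"a" is built in memory with types.ModuleType and exec (an assumption made only
to keep the example self-contained; "before" and "after" are illustrative
names), and print is used in its call form:

```python
import types

# Stand-in for module a, created in memory instead of from a file.
a = types.ModuleType('a')
exec("_type = int\n\ndef return_type():\n    return _type()\n", a.__dict__)

before = a.return_type()   # _type is int at this point
a._type = str              # rebinding from outside, as module b does
after = a.return_type()    # the same function now builds a str

print(repr(before), repr(after))
```

Nothing in the source text of return_type changed between the two calls, which
is exactly why looking at the source alone is not enough.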

Stefan



