Determining types of variables and function params by parsing the source code
Stefan Schwarzer
sschwarzer at sschwarzer.net
Fri Aug 23 16:50:08 EDT 2002
Hi Markus
Markus Meyer wrote:
> [Motivation for determining types]
>
> Variables (objects) are instantiated in Python by using one of the
> following mechanisms (am I missing something?):
> 1) Explicit instantiation of basic types (numbers, strings, ...)
Do you mean simply writing a literal, e. g. 1?
> 2) Construction by the object constructor
As in ...
class X: pass
X() # new object
?
> 3) Assignment of another variable
> 4) Assignment of the return value of a function
> 5) Assignment of the result of an operator action (f.e. list
> operators)
Perhaps I don't understand you correctly, but assignments per se don't create
new objects; they bind new references to existing objects.
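
A minimal sketch of this point, written for Python 3 (so print is a function, unlike the Python 2 snippets in this thread). The names a, b and c are only illustrative:

```python
a = [1, 2, 3]   # the list display creates one new list object
b = a           # assignment only binds a second name to that same object

b.append(4)     # mutate the object through one name ...
print(a)        # ... and the change is visible through the other name
print(a is b)   # both names refer to the identical object

c = list(a)     # calling the constructor, in contrast, builds a new object
print(c is a)
```

So for a type analysis, the right-hand side of an assignment is what matters; the assignment itself only passes an existing object (and its type) along.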
There is also the "new" module, though I don't know what it does behind the
scenes. There are other kinds of factories in Python as well, e. g. the match
function in the "re" module, which returns a match object.
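
As a small illustration of such a factory, here is a sketch for Python 3 using re.match. The pattern and the input strings are made up for the example. Note that the same call can also return None, which is one more thing a type analysis would have to account for:

```python
import re

# re.match acts as a factory: no class is named at the call site,
# yet a new match object comes back when the pattern matches.
m = re.match(r"(\w+)=(\w+)", "key=value")
print(type(m).__name__)          # class name of the match object
print(m.group(1), m.group(2))

# The very same function returns None when the pattern does not match.
no_match = re.match(r"\d+", "abc")
print(no_match)
```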
> What I intend to do is to "trace forward" these assignments to obtain
> as much type information as possible. Imagine the following code:
>
> "this is hello.py"
> def print13(s):
>     t = s[1:3]
>     print t
>
> s = 'hello'
> print13(s)
>
> The following information could be extracted by a sophisticated parser:
> - print13 takes an argument of type string (because it is called with
> argument s, which is a string because of the previous assignment)
On the other hand, no one disallows this:
def print13(s):
    t = s[1:3]
    print t

s = 'hello'
print13(s)
print13( [1, 2, 3, 4] )
In the second invocation, s in print13 is a list.
> - print takes an argument of type string (because it is called with
> argument t, which was constructed by using the [] operator on s, which
> is an argument of the function print13, which takes an argument of
> type string)
print13 _takes_ arguments of _any_ type. Some of them will cause an exception
to be raised, some not.
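
This can be tried out directly. The following is a sketch for Python 3; the return statement and the results list are additions for inspection only and are not part of the original print13:

```python
def print13(s):
    t = s[1:3]
    print(t)
    return t   # added only so the result type can be recorded below

results = []
for arg in ("hello", [1, 2, 3, 4], (1, 2, 3, 4), 42):
    try:
        # Slicing works for any sequence type and yields that same type.
        results.append(type(print13(arg)).__name__)
    except TypeError:
        # An int cannot be sliced, so this call fails only at run time.
        results.append("TypeError")

print(results)   # ['str', 'list', 'tuple', 'TypeError']
```

Nothing in the definition of print13 itself distinguishes the calls that work from the one that fails; only the call sites do.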
> Of course I'm aware that there is inheritance, and some function can
> take arguments of more than one type. One would have to take this into
> account.
How? :-)
> Further extensions would be possible. F.e., if you have some context
> where var.myspecialfunc() is called, and MySpecialClass is the only
> class declaring a public function with name myspecialfunc(), you can
> safely assume, that var is of type MySpecialClass.
>
> So, is this total crap? Has anyone else tried this? If not, would you
> recommend writing (1) another parser for Python, (2) use the parser
> module, (3) do something completely different?
I think you can get some type information, but it will be more of a good guess
than a safe bet. Also consider dynamic source/code generation via exec, eval and
execfile. Even without exec etc., you can't safely determine the type by static
analysis, e. g.
-----
# Module a.py
_type = int
def return_type():
    return _type()
print return_type() # gives 0 (int)
-----
Analyzing module "a" might make you think that return_type returns an int ...
-----
# Module b.py
import a
print a.return_type() # gives 0 (int)
a._type = str
print a.return_type() # gives '' (str)
-----
... but it can return any type.
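
The two modules can be condensed into one runnable file. This is only a sketch for Python 3: it assumes an in-memory module built with types.ModuleType and exec as a stand-in for a.py, so no files are needed. It also touches on the exec case, since the source of "a" is just a string at the time an analyzer would look at the file:

```python
import types

# Hypothetical stand-in for a.py, created in memory instead of on disk.
a = types.ModuleType("a")
exec(
    "_type = int\n"
    "def return_type():\n"
    "    return _type()\n",
    a.__dict__,
)

first = a.return_type()    # _type is still int at this point
a._type = str              # rebinding from outside, as b.py does
second = a.return_type()   # same function object, now builds a str

print(repr(first), repr(second))   # 0 ''
```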
Stefan