A "bug" in inspect.findsource

Joonas Paalasmaa joonas at olen.to
Thu Jan 10 15:55:23 EST 2002


There is a minor bug in inspect.findsource
The starting line number of an examined class is searched by using
regular expressions. The problem of using regexps is that they can't
handle multiline strings correctly, whereas tokenizer can.

Here's an example where original inspect.findsource returns an incorrect
answer, whereas modified inspect2.findsource, that uses tokenizer
returns a right answer.

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import inspect, inspect2, inspect_test
>>> inspect.findsource(inspect_test.DummyClass)
(["'''\n", 'You should remember, that\n', 'class DummyClass can only
be\n',
'used for dummy purposes.\n', "'''\n", '\n', 'class DummyClass:\n',
'   pass\n'], 2 )
>>> inspect2.findsource(inspect_test.DummyClass)
(["'''\n", 'You should remember, that\n', 'class DummyClass can only
be\n',
'used for dummy purposes.\n', "'''\n", '\n', 'class DummyClass:\n',
'   pass\n'], 7 )

The biggest drawback of using the tokenizer is, that is it very slow,
but it is
not a good thing either, that a library function returns an incorrect
answer.
Perhaps tokenizer should be implemented in C to achieve reasonable
speed.

## inspect_test.py begin ##
'''
You should remember, that
class DummyClass can only be
used for dummy purposes.
'''

class DummyClass:
    pass
## inspect_test.py end ##



## original inspect.findsource begin ###
def findsource(object):
    """Return the entire source file and starting line number for an
object.

    The argument may be a module, class, method, function, traceback,
frame,
    or code object.  The source code is returned as a list of all the
lines
    in the file and the line number indexes a line in that list.  An
IOError
    is raised if the source code cannot be retrieved."""
    try:
        file = open(getsourcefile(object))
    except (TypeError, IOError):
        raise IOError, 'could not get source code'
    lines = file.readlines()
    file.close()

    if ismodule(object):
        return lines, 0

    if isclass(object):
        name = object.__name__
        pat = re.compile(r'^\s*class\s*' + name + r'\b')
        for i in range(len(lines)):
            if pat.match(lines[i]): return lines, i
        else: raise IOError, 'could not find class definition'

    if ismethod(object):
        object = object.im_func
    if isfunction(object):
        object = object.func_code
    if istraceback(object):
        object = object.tb_frame
    if isframe(object):
        object = object.f_code
    if iscode(object):
        if not hasattr(object, 'co_firstlineno'):
            raise IOError, 'could not find function definition'
        lnum = object.co_firstlineno - 1
        pat = re.compile(r'^\s*def\s')
        while lnum > 0:
            if pat.match(lines[lnum]): break
            lnum = lnum - 1
        return lines, lnum
## original inspect.findsource end ###



## modified inspect2.findsource begin ##
def findsource(object):
    """Return the entire source file and starting line number for an
object.

    The argument may be a module, class, method, function, traceback,
frame,
    or code object.  The source code is returned as a list of all the
lines
    in the file and the line number indexes a line in that list.  An
IOError
    is raised if the source code cannot be retrieved."""
    try:
        file = open(getsourcefile(object))
    except (TypeError, IOError):
        raise IOError, 'could not get source code'
    lines = file.readlines()
    file.close()

    if ismodule(object):
        return lines, 0

    if isclass(object):
        name = object.__name__
        # XXX that StringIO trick is really ugly
        tokens =
tokenize.generate_tokens(StringIO.StringIO("".join(lines)).readline)
        while 1:
            try:
                if tokens.next()[1] == "class":
                    nexttoken = tokens.next()
                    if nexttoken[1] == name:
                        return lines, nexttoken[2][0]
            except StopIteration:
                raise IOError, 'could not find class definition'

    if ismethod(object):
        object = object.im_func
    if isfunction(object):
        object = object.func_code
    if istraceback(object):
        object = object.tb_frame
    if isframe(object):
        object = object.f_code
    if iscode(object):
        if not hasattr(object, 'co_firstlineno'):
            raise IOError, 'could not find function definition'
        lnum = object.co_firstlineno - 1
        pat = re.compile(r'^\s*def\s')
        while lnum > 0:
            if pat.match(lines[lnum]): break
            lnum = lnum - 1
        return lines, lnum
## modified inspect2.findsource end ##



More information about the Python-list mailing list