source code commentator

Tue Sep 4 08:37:46 EDT 2001

Comprehension of source code is a precious commodity with both open
source and proprietary code. The immediate authors of a piece of code
may not have the time or inclination to document it in such a way that
others can understand it. What follows is a tool for commenting source
code in a separate file, without modifying the original source file.

This is good because it may not be practical or desirable to insert
new commentary in existing source files (for instance if the comments
are for private use, or the commentator lacks the influence to get
the comments put into the original code).

The drawback is that, as always when code and comments are maintained
separately, the code will drift and evolve while the comments remain
static. Commentary, once written, must be assumed to apply only to a
specific version of the source code. Even so, most software projects
evolve slowly enough that a good clarification of the principles used
a few versions ago is still useful enlightenment with today's version.
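
The commentary file format the script expects is small enough to
sketch on its own. What follows is a minimal illustration in
present-day Python 3, not the script's own code (which appears below
and targets the Python of 2001). The name parse_commentary and the
(line number, comment lines) pair layout are assumptions made for this
example. The first word of the first line names the source file, and
each "LINE n" marker begins the comments that sit beside source line n
and the lines after it.

```python
def parse_commentary(text):
    """Split commentary text into (sourcefile, blocks).

    blocks is a list of (line_number, [comment lines]) pairs. Comments
    that appear before the first "LINE n" marker attach to line 1.
    """
    lines = text.splitlines()
    # Only the first word of the first line is taken as the file name.
    sourcefile = lines[0].split()[0]
    blocks = [(1, [])]
    for line in lines[1:]:
        if line.startswith("LINE "):
            # A marker opens a new block anchored at the given line.
            blocks.append((int(line[5:]), []))
        else:
            blocks[-1][1].append(line)
    return sourcefile, blocks


sample = """Python-1.5.2/Objects/xxobject.c
Boilerplate and attributions.
LINE 46
This defines a simple object.
"""
sourcefile, blocks = parse_commentary(sample)
print(sourcefile)
print(blocks)
```

Keeping the markers as plain text lines means the commentary can be
written in any editor, at the cost of having to renumber by hand when
the source changes.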

#############################################

#!/usr/bin/env python

# scc.py -- a source code commentator program. Given a piece of source
# code and a suitably formatted text file of line-by-line commentary
# (think Lions' Unix source code commentary), creates an HTML page
# with line numbers, and the commentary appearing correctly aligned in
# the left margin. I used a C source file as an example, but this
# should work with any programming language, or even literary text.

import sys, os, string

borderFlag = 0

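def section(sourcefile, m, n):
     # Return source lines m..n-1, each prefixed with its line number.
     # "tail -n +M" and "head -n N" are the POSIX spellings; the older
     # "tail +M" form is rejected by some current implementations.
     inf = os.popen("tail -n +%d < %s | head -n %d" % (m, sourcefile, n - m))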
     r = [ ]
     for x in inf.readlines():
         # Escape HTML metacharacters; "&" must be handled first.
         x = string.replace(x, "&", "&amp;")
         x = string.replace(x, "<", "&lt;")
         x = string.replace(x, ">", "&gt;")
         r.append(("%5d " % m) + x[:-1])
         m = m + 1
     inf.close()
     return r

def flush(sourcefile, newlinenumber=None):
     global linenumber
     global cruft
     if newlinenumber == None:
         newlinenumber = numberOfLines + 1
     print "<tr><td valign=top>"
     for x in cruft:
         print x
     cruft = [ ]
     print "</td><td valign=top><pre>"
     for x in section(sourcefile, linenumber, newlinenumber):
         print x
     print "</pre></td></tr>"
     linenumber = newlinenumber

def makeHtml(sourcefile, commentlines):
     global cruft, numberOfLines, linenumber
     inf = os.popen("wc " + sourcefile)
     numberOfLines = string.atoi(string.split(inf.readline())[0])
     inf.close()
     linenumber = 1
     cruft = [ ]
     print '''<html>
     <title>Commentary on %(sourcefile)s</title>
     <body bgcolor="#FFFFFF">
     <h1>Commentary on %(sourcefile)s</h1>
     <table%(border)s>
     ''' % {"sourcefile": sourcefile,
            "border": borderFlag and " border" or ""}
     for L in commentlines:
         if L[:5] == "LINE ":
             flush(sourcefile, string.atoi(L[5:]))
         else:
             cruft.append(L)
     flush(sourcefile)
     print '''</table></body></html>'''

# Normally commentary like the following would go in a separate
# file. Then you would type:
#   ./scc.py -f commentaryfile > commentary.html
# or if you want borders in your HTML table to keep things even a
# little more visually clear, type
#   ./scc.py -b -f commentaryfile > commentary.html
# If you leave out "-f", commentary will be taken from standard input.
# With "-t", the built-in sample commentary below is used instead.
# HTML formatting tricks can be used in commentary, as shown below.

testLines = """Python-1.5.2/Objects/xxobject.c
Here is a comment that will appear at line 1. Not much is happening yet,
just a lot of boilerplate and attributions.
LINE 46
This defines a simple object. PyObject_HEAD is defined in
Include/object.h and provides fields for reference counting and an object
type. The only other thing here is x_attr, which will be a dictionary
of attributes to be added at run time.
LINE 60
This takes care of setting up an initial reference count, and doing the
correct malloc. Note forward reference to Xxtype defined
later on line 130.
LINE 73
When the reference count hits zero, we will come here and destroy the
object and whatever contents it has acquired. PyMem_DEL is a fancy
euphemism for free.
LINE 84
It's a doggone shame there isn't something more interesting happening here.
LINE 99
Now things start to get interesting. Assuming we've placed something in
the x_attr dictionary, this is where we retrieve it.
LINE 105
If we didn't find what we wanted in x_attr, try looking for a method
instead. The only method (see line 89) is demo.
LINE 114
We are trying to put something in the x_attr dictionary. We didn't
create a dictionary in newxxobject, so if we haven't had a reason
to create one yet, do so now.
LINE 120
setattr functions should delete dictionary entries when told to
assign the value to a NULL pointer.
LINE 127
This is what you actually expect setattr to do: put the thing in
the x_attr dictionary.
LINE 130
Types allow Python to assign standard methods (like repr,
dealloc, setattr, and getattr) and keep track of
names (here "xx") and object sizes. PyTypeObject and
PyObject_HEAD_INIT are defined in Include/object.h.
"""

if __name__ == "__main__":
     import getopt
     infile = sys.stdin
     testFlag = 0   # must be set before the "if testFlag" check below
     optlist, args = getopt.getopt(sys.argv[1:], 'f:bt')
     for option, optarg in optlist:
         if option == '-f':
             infile = open(optarg)
         elif option == '-b':
             borderFlag = 1
         elif option == '-t':
             testFlag = 1
     if testFlag:
         lines = string.split(testLines, "\n")
     else:
         lines = map(lambda x: x[:-1], infile.readlines())
         infile.close()
     sourcefile = string.split(lines.pop(0))[0]
     makeHtml(sourcefile, lines)
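
#############################################

The section function above shells out to tail and head, so it depends
on a Unix-like system. As a rough alternative, here is a sketch in
present-day Python 3 that does the same slicing, numbering and
escaping in-process. It is not part of the original script: it assumes
the source has already been read into a list of lines (source_lines is
a name invented for this example) and uses the standard library's
html.escape in place of the three string replacements.

```python
import html


def section(source_lines, m, n):
    """Return source lines m..n-1 (1-indexed), escaped and numbered."""
    out = []
    # Stop at the end of the file even if n points past it.
    for number in range(m, min(n, len(source_lines) + 1)):
        raw = source_lines[number - 1].rstrip("\n")
        # quote=False escapes only &, < and >, matching the script above.
        text = html.escape(raw, quote=False)
        out.append("%5d %s" % (number, text))
    return out


demo = ["a < b\n", "x && y\n", "z\n"]
for row in section(demo, 1, 3):
    print(row)
```

Reading the whole file once also avoids starting two processes for
every commentary block.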