whitespace, comment stripper, and EOL converter

M.E.Farmer mefjr75 at hotmail.com
Sat Apr 16 14:28:15 EDT 2005


Glad you are making progress ;)

>I give you a brief example of the xref output (taken from your code,
>also if the line numbers don't match, because I modified your code,
>not being interested in eof's other than Linux).

What happens when you try to analyze a script from a different OS? It
usually looks like a skewed mess, which is why I added EOL
conversion, so it is painless to convert to your EOL of choice.
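
To make the EOL conversion idea concrete, here is a minimal sketch. It
is not the code from the posted script; convert_eol is a hypothetical
helper name, and the only assumption is that the whole file has been
read into one string.

```python
import re

def convert_eol(text, eol="\n"):
    """Return text with every line ending replaced by eol (hypothetical helper)."""
    # Match \r\n first so a Windows ending is not counted as two endings.
    return re.sub(r"\r\n|\r|\n", eol, text)

mixed = "a = 1\r\nb = 2\rc = 3\n"
unix = convert_eol(mixed)          # every ending becomes \n
dos = convert_eol(mixed, "\r\n")   # every ending becomes \r\n
```

Doing the conversion before tokenizing means the row numbers reported
for each token line up with what your editor shows.
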
The code I posted consists of a class and a Main function.
The class has three methods.
__init__ is called by Python when you create an instance of the class
Stripper. All __init__ does here is set an instance attribute, self.raw.
format is called explicitly with a few arguments to start the
tokenizer.
__call__ is special; it is not easy to grasp how this even works... at
first.
In Python, when you treat an instance like a function, Python invokes
the __call__ method of that instance's class, if it defines one.
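
To see that mechanism in isolation, here is a small sketch. Counter is
a made-up class used only for illustration; it has nothing to do with
the Stripper class itself.

```python
class Counter:
    """Hypothetical class, used only to illustrate __call__."""

    def __init__(self):
        self.calls = 0

    def __call__(self, value):
        # Runs whenever an instance is used with function-call syntax.
        self.calls += 1
        return value * 2

counter = Counter()
result = counter(21)   # equivalent to type(counter).__call__(counter, 21)
is_callable = callable(counter)
```

Because the instance keeps its own state between calls, it can be
handed to any code that expects a plain function.
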
Example:
        try:
            tokenize.tokenize(text.readline, self)
        except tokenize.TokenError, ex:
            traceback.print_exc()
The snippet above is from the Stripper class.
Notice that tokenize.tokenize is being fed a reference to self (if
this code is running, self is an instance of Stripper).
tokenize.tokenize is really a hidden loop.
Each token generated is sent to self as five parts: toktype, toktext,
(startrow, startcol), (endrow, endcol), and line. self is callable and
has a __call__ method, so tokenize really sends the five-part
info to __call__ for every token.
If this was obvious then ignore it ;)
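
If it helps, the hidden loop can be spelled out by hand. This is only a
sketch, written for Python 3, where tokenize.tokenize no longer accepts
a callback and tokenize.generate_tokens yields the same five parts
instead. TokenCollector and feed_tokens are hypothetical names, not
part of the posted script.

```python
import io
import tokenize

class TokenCollector:
    """Hypothetical stand-in for Stripper: records each token it is handed."""

    def __init__(self):
        self.seen = []

    def __call__(self, toktype, toktext, start, end, line):
        # start and end are (row, col) tuples.
        self.seen.append((tokenize.tok_name[toktype], toktext, start))

def feed_tokens(readline, tokeneater):
    """Spell out the loop: call tokeneater once per token, five parts each."""
    for tok in tokenize.generate_tokens(readline):
        tokeneater(*tok)

source = io.StringIO("x = 1  # a comment\n")
collector = TokenCollector()
feed_tokens(source.readline, collector)
```

After the call, collector.seen holds one entry per token, including the
COMMENT token that a stripper would choose to drop.
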

M.E.Farmer



