Strip white spaces from source

qwweeeit at yahoo.it
Tue May 10 11:05:18 EDT 2005


Hi Richie,
I did not post my solution because I did not want to "pollute" the
pythonic way of programming.
Young programmers, don't follow me!
I hate (because I am not able to use them...) classes and regular
expressions.
Instead I like lists, try/except (to limit or better eliminate
debugging) and os.system + shell programming (I use Linux).
The problem of stripping white space from Python source lines could
easily be solved (though not by me...) with regular expressions.

Instead I chose the hard way:
Imagine you have a lot of strings representing python source lines (in
my case I have almost 30000 lines).
Let's call a generic line "sLine" (with or without the white spaces
representing indentation).
To strip the unnecessary spaces you need to identify the operators.

Thanks to the advice of Alex Martelli, there is a parsing method based
on the tokenize module that achieves this:

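import tokenize, cStringIO
try:
    # generate_tokens yields one tuple per token found in sLine
    for x in tokenize.generate_tokens(cStringIO.StringIO(sLine).readline):
        if x[0] == tokenize.OP:   # operator token (numeric code 50 here)
            # drop one space before and one space after the operator
            sLine = sLine.replace(' ' + x[1], x[1])
            sLine = sLine.replace(x[1] + ' ', x[1])
except tokenize.TokenError:
    # raised when sLine is only part of a multi-line statement
    pass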

- x[0] is the 1st element of the x tuple, and 50 is the numeric code
for the OP (operator) token type in my Python version (tokenize.OP is
the version-independent name for it).
(To experiment with the x tuple, you can print it with a
"print str(x)". You get one tuple per token in the line.)
- x[1] (the 2nd element of the x tuple) is the operator string itself.
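
A minimal sketch for experimenting with those tuples, assuming Python 3
(where io.StringIO takes the place of cStringIO). The helper name
operator_tokens and the sample line are made up for illustration; they
are not part of the original routine.

```python
import io
import tokenize


def operator_tokens(line):
    """Return the operator strings that tokenize finds in one source line."""
    readline = io.StringIO(line).readline
    # tok[0] is the token type, tok[1] the token text, as in the post
    return [tok[1] for tok in tokenize.generate_tokens(readline)
            if tok[0] == tokenize.OP]


# The '+' inside the string literal is part of a STRING token, not an OP.
print(operator_tokens("a = b + 'c + d'\n"))
```

Comparing against tokenize.OP rather than a literal number keeps the
test valid on Python versions where the numeric code is not 50.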

The try/except is one of my bad habits:
without it the program fails when the line is only part of a
multi-line statement, because tokenize raises TokenError.
Ask Alex... I haven't gone deeper.
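
The multi-line failure can be reproduced in isolation. The sketch below
assumes Python 3 and uses a hypothetical helper, is_complete_statement,
that only reports whether tokenize gets through a line without raising
TokenError.

```python
import io
import tokenize


def is_complete_statement(line):
    """Return False if tokenize hits end of input inside an open construct."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(line).readline):
            pass
    except tokenize.TokenError:
        # e.g. an unclosed parenthesis: the statement continues on a later line
        return False
    return True


print(is_complete_statement("total = a + b\n"))
print(is_complete_statement("total = foo(a,\n"))
```

Feeding the tokenizer whole statements instead of single physical lines
would avoid the exception, at the cost of having to join continuation
lines first.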

At the end you have sLine with the white space around operators
stripped...
There is still a flaw: because replace() works on the whole line, this
method also strips spaces next to operator characters inside strings.
(I don't care...).
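
For anyone who does care, one possible way around it is to use the
column positions that tokenize reports instead of str.replace. This is a
different technique from the routine above, sketched for Python 3 and
for a single physical line; the name strip_around_operators is
hypothetical, and plain (non-f) string literals are assumed.

```python
import io
import tokenize


def strip_around_operators(line):
    """Remove blanks next to operator tokens, leaving string contents alone."""
    keep = [True] * len(line)
    # never touch the leading indentation
    indent_end = len(line) - len(line.lstrip(' \t'))
    try:
        for tok in tokenize.generate_tokens(io.StringIO(line).readline):
            if tok.type != tokenize.OP or tok.start[0] != 1:
                continue
            # mark blanks immediately to the left of the operator
            i = tok.start[1] - 1
            while i >= indent_end and line[i] == ' ':
                keep[i] = False
                i -= 1
            # mark blanks immediately to the right of the operator
            j = tok.end[1]
            while j < len(line) and line[j] == ' ':
                keep[j] = False
                j += 1
    except tokenize.TokenError:
        pass  # incomplete multi-line statement: keep what was marked so far
    return ''.join(c for c, k in zip(line, keep) if k)


print(strip_around_operators("x = 'a + b' + c\n"))
```

Because only the positions of real OP tokens are touched, the blanks
inside 'a + b' survive, and so does the indentation of the line.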

A last word of caution: I haven't tested this extract from my
routine...

This small script is part of a bigger program: a cross-reference tool,
but don't ask me for that...

Bye.



