Strip white spaces from source
qwweeeit at yahoo.it
Tue May 10 11:05:18 EDT 2005
Hi Richie,
I did not post my solution because I did not want to "pollute" the
pythonic way of programming.
Young programmers, don't follow me!
I hate (because I am not able to use them...) classes and regular
expressions.
Instead I like lists, try/except (to limit or better eliminate
debugging) and os.system + shell programming (I use Linux).
The problem of stripping white spaces from python source lines could be
easily (not for me...) solved by RE.
Instead I chose the hard way:
Imagine you have a lot of strings representing python source lines (in
my case I have almost 30000 lines).
Let's call a generic line "sLine" (with or without the white spaces
representing indentation).
To strip the unnecessary spaces you need to identify the operators.
Thanks to the advice of Alex Martelli, there is a parsing method based
on the tokenize module to achieve this:
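import tokenize, cStringIO
try:
    # generate_tokens wants a readline-style callable, so wrap the line
    for x in tokenize.generate_tokens(cStringIO.StringIO(sLine).readline):
        if x[0]==50:  # operator token; tokenize.OP is the portable name
            # drop one blank before and one blank after the operator
            sLine=sLine.replace(' '+x[1],x[1])
            sLine=sLine.replace(x[1]+' ',x[1])
except tokenize.TokenError:
    pass  # raised when the line is part of a multi-line statement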
- x[0] is the 1st element of the x tuple (the token type), and 50 is
the code for an operator token (OP) in the Python version I use. The
number differs between Python versions, so tokenize.OP is the safer
spelling.
(For those who want to experiment with the x tuple, you can print it
simply with
"print str(x)". You get one tuple for each token present in
the line.)
- x[1] (the 2nd element of the x tuple) is the operator itself.
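For anyone who wants to look at the tokens on Python 3, here is a minimal sketch. It assumes io.StringIO in place of cStringIO, and the function name operator_tokens is only illustrative, not part of my routine:

```python
import io
import tokenize

def operator_tokens(line):
    """Return the operator strings found in one complete source line."""
    tokens = tokenize.generate_tokens(io.StringIO(line).readline)
    # tok.type corresponds to x[0] and tok.string to x[1] in the text above.
    return [tok.string for tok in tokens if tok.type == tokenize.OP]

print(operator_tokens("a = b + c[ 1 ]\n"))  # ['=', '+', '[', ']']
```

Comparing against tokenize.OP rather than a literal number keeps the test valid if the numeric codes change.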
The try/except is one of my bad habits:
without it the program fails (with tokenize.TokenError) if the line is
part of a multi-line statement.
Ask Alex... I haven't gone deeper.
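To see the multi-line failure in isolation, a small Python 3 sketch may help. The helper name is_complete_line is hypothetical, and it assumes each line is tokenized on its own, as in the snippet above:

```python
import io
import tokenize

def is_complete_line(line):
    """Return False if tokenizing the line on its own raises TokenError."""
    try:
        for _ in tokenize.generate_tokens(io.StringIO(line).readline):
            pass  # only interested in whether tokenizing finishes
    except tokenize.TokenError:
        # e.g. an opening bracket whose closing bracket is on a later line
        return False
    return True

print(is_complete_line("x = (1 + 2)\n"))
print(is_complete_line("x = (1 +\n"))
```

The second call ends inside an open parenthesis, which is the case the try/except silently skips.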
At the end you have sLine with white spaces stripped...
There is still a mistake...: this method also strips white spaces
inside strings.
(I don't care...).
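For those who do care, one possible way around it is to rebuild the line from the token positions instead of using replace, so that string tokens are copied untouched. This is only a sketch for Python 3 under my own assumptions: strip_operator_spaces is a made-up name, it handles one complete line at a time, and f-strings (which newer Python versions split into several tokens) are not considered:

```python
import io
import tokenize

def strip_operator_spaces(line):
    """Drop blanks next to operator tokens; keep strings, comments, indentation."""
    try:
        tokens = list(tokenize.generate_tokens(io.StringIO(line).readline))
    except tokenize.TokenError:
        return line  # part of a multi-line statement: give it back unchanged
    layout = {tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE,
              tokenize.NL, tokenize.ENDMARKER}
    out = []
    pos = 0      # column up to which the original line has been copied
    prev = None  # previous token that carries real text
    for tok in tokens:
        if tok.type in layout:
            continue
        gap = line[pos:tok.start[1]]  # blanks between the two tokens
        touches_op = prev is not None and tokenize.OP in (prev.type, tok.type)
        if touches_op and tok.type != tokenize.COMMENT:
            gap = gap.replace(" ", "")
        # Copy the token text verbatim, so blanks inside strings survive.
        out.append(gap + line[tok.start[1]:tok.end[1]])
        pos, prev = tok.end[1], tok
    out.append(line[pos:])  # trailing newline, if any
    return "".join(out)

print(repr(strip_operator_spaces('    x = f( a , "b = c" )  # note\n')))
# '    x=f(a,"b = c")  # note\n'
```

The gap before the first token is never touched, so the indentation stays as it was.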
A last word of caution: I haven't tested this extract from my
routine...
This small script is part of a bigger program: a cross-reference tool,
but don't ask me for that...
Bye.