[Pythonmac-SIG] space indented files

Oliver Steele steele@cs.brandeis.edu
Thu, 14 Oct 1999 06:24:11 -0400


Just van Rossum <just@letterror.com> writes:
> 2. Non-tab indentation
> I might prefer to solve this as a file read/write filter which will
> transform any indentation into tab based indentation upon read and
> transform to the original style of indentation upon write. I would leave
> any non-indenting whitespace as it is. Would this be acceptable or would
> Oliver Steele's approach be preferred?

Let's call the approach that recognizes space-runs as equivalent to tabs,
"space recognition".  Let's call the approach that translates space-based
indentation into tab-based indentation, and back, "space translation".
"Space recognition" is what emacs does, and it's what you call "Oliver
Steele's approach" above.  "Space translation" is the method that you
"might prefer to solve this as".

I've actually implemented both.  SpaceTranslationPatch.py implements space
TRANSLATION:  it "makes the IDE change spaces in space-indented files to
tabs when it opens them, and back to spaces when it saves a file that was
converted this way."  PythonSpaceEditor.py implements space RECOGNITION:  it
"makes the IDE work with space-indented source files, by changing its
indentation-related commands to recognize that a sequence of spaces is
equivalent to a tab."
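
To make the "recognition" idea concrete, here is a minimal sketch of a single indentation command (dedent) that treats a run of leading spaces as equivalent to a tab. It is not code from PythonSpaceEditor.py: the function name `dedent` and the `tab_width` parameter are hypothetical, and the real patch hooks into the IDE's editor commands rather than operating on bare strings.

```python
def dedent(line, tab_width=4):
    """Remove one level of indentation from `line`.

    Hypothetical illustration: a leading tab counts as one level, and so
    does a run of up to `tab_width` leading spaces.
    """
    if line.startswith("\t"):
        return line[1:]
    # Count the leading spaces and drop at most one tab stop's worth.
    run = len(line) - len(line.lstrip(" "))
    return line[min(run, tab_width):]
```

The file contents are never rewritten here; only the command's notion of "one indentation level" changes.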

After using them both, I prefer the space translation approach, in
SpaceTranslationPatch.py.  (This is the same preference as Just's.)  I
initially implemented space recognition because I thought it would be good
to stick to a known design -- that is, to copy what emacs does, because it
works -- and because it's guaranteed to preserve the contents of the file
(opening and then saving a file with the space translation approach may
change some spaces to tabs, if a file has both).  But it's too annoying on a
Mac to click in what looks like a tab stop and find yourself in the middle
of a set of spaces, or to use the left or right arrows or the backspace to
move what looks like one tab and find yourself moving only a space.  These
problems could be fixed with additional work, but at that point the text
presentation has diverged from the underlying data so much that I think it's
better just to change the underlying data instead, and keep the connection
between it and its presentation direct.  (Emacs users live with these
problems, but they seem worse in a Mac environment, maybe because I'm used
to a higher standard for direct manipulation.)

SpaceTranslationPatch.py's algorithm is this:
(1) When an editor window is opened for a file, detect whether it contains
the string '\r    ' (a line that starts with four spaces).
(2) If it does, assume that it's a space-indented file.  If the file was
saved with Python font preferences, read the tab width from those.
Otherwise, if there are lines indented by eight spaces but none indented by
four, assume the tab stop is at eight spaces.  Otherwise assume it's at four.
(3) Set the editor tab width to this.
(4) Replace all runs of tabstop*n spaces at the beginning of a line with n
tabs.  Also, set a flag on the editor window to mark that the conversion
was performed.
(5) When an editor window is saved, if this flag is set, replace all runs of
n tabs at the beginning of a line with tabstop*n spaces.  This eases
interoperability with other platforms that use spaces to indent.
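
The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in SpaceTranslationPatch.py: all function names are hypothetical, the font-preference lookup in step 2 is omitted, the "eight but not four" test is one reading of the heuristic, and leftover spaces that don't fill a whole tab stop are simply kept.

```python
EOL = "\r"  # classic Mac OS line ending, matching the '\r    ' test in step 1


def looks_space_indented(text):
    # Step 1: does some line (after the first) start with four spaces?
    return (EOL + "    ") in text


def guess_tab_width(text):
    # Step 2 (assumed reading): 8 if some line is indented by exactly
    # eight spaces and none by exactly four; otherwise 4.
    widths = set()
    for line in text.split(EOL):
        body = line.lstrip(" ")
        if body and not body.startswith("\t"):
            widths.add(len(line) - len(body))
    return 8 if (8 in widths and 4 not in widths) else 4


def spaces_to_tabs(text, tab_width):
    # Step 4: leading runs of tab_width*n spaces become n tabs.
    out = []
    for line in text.split(EOL):
        body = line.lstrip(" ")
        n_tabs, extra = divmod(len(line) - len(body), tab_width)
        out.append("\t" * n_tabs + " " * extra + body)
    return EOL.join(out)


def tabs_to_spaces(text, tab_width):
    # Step 5: leading runs of n tabs become tab_width*n spaces.
    out = []
    for line in text.split(EOL):
        body = line.lstrip("\t")
        out.append(" " * (tab_width * (len(line) - len(body))) + body)
    return EOL.join(out)


def open_text(text):
    # Returns (editor_text, tab_width); tab_width is None when no
    # conversion was performed, which stands in for the window flag.
    if not looks_space_indented(text):
        return text, None
    width = guess_tab_width(text)  # step 3 would set the editor tab width
    return spaces_to_tabs(text, width), width


def save_text(text, tab_width):
    # Only translate back if the file was converted on open.
    return text if tab_width is None else tabs_to_spaces(text, tab_width)
```

For a purely space-indented file, `save_text(*open_text(src))` gives back `src` unchanged.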

Remaining work:
- recognize files indented with a mixture of tabs and spaces
- test the heuristics on more files
- add a user indication and user control for the translation (I probably
won't do this one, but if it gets added to the IDE someone might want to do
this)