Tabs versus Spaces in Source Code

Mumia W. mumia.w.18.spam+fbi.gov at earthlink.net
Tue May 23 09:19:56 EDT 2006


Xah Lee wrote:
> the following are 2 FAQ following this thread. Thanks.
> 
> Addendum: 2006-05-15
> 
> Q: What do you mean by embedding tab position info into the source code?
> How's that gonna be done?
> 
> A: Tech geekers may not realize, but such embedding of meta info does
> exist in many technologies by various means because of a need. For
> example, Mac OS Classic's resource fork and Mac OS X's bundling system,
> unix shell script's shebang (#!), emacs and Python's encoding
> declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
> change-log insertion, Mathematica's source code system the Notebook,
> Microsoft Word's transparent meta data, as well as HTML and XML's
> various declarations embedded in the file. Some of these systems are
> good designs and some are hacks.
> 

Vim's mode-lines do this too.

> Somehow tech geekers have the sense that “source code” must be a
> plain text file containing nothing else but the programming code. This
> may be a defensible position, but as we can see in the above examples,
> this idea is primitive and does not address the various needs. If the
> tech geekers had thought through these issues, computing languages
> and their source code might have developed into more powerful and flexible
> integrated systems like the above standardized examples.

The tech geekers have thought about it. Donald Knuth invented TeX, and 
went on to invent the WEB literate programming system. You don't get any 
geekier than that :)

> For instance,
> many commercial development systems actually already have such
> meta-data embodied with the source code. (e.g. Borland Delphi,
> Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
> Mathematica.) Some of these not only embody development-related info
> such as debug points or linking files, but also allow programmers to
> highlight code for visual purposes like a word processor, or even
> display it visually as typeset mathematics.
> 
> Q: Converting spaces to tabs is actually easy. I don't see how spaces
> lose info.
> 
> A: Here is an illustration of how it is not possible to convert spaces
> to tabs. Suppose you are writing in a language where the indentation is
> part of the semantics, not just for appearance. Now, suppose you have
> these two lines:

I'd say that such a language removes the choice of whether to use tabs 
or spaces, and the discussion is over when you don't have a choice.

> 
> 1234567890
>   A
>     B
> 
> The first line has a 2-space prefix and the second line has a 4-space
> prefix. If you convert this to tabs, how do you know whether that's 1 and
> 2 tabs, or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
> represents, where n is the smallest space prefix in the code, unless n
> == 1.

   vim: tabstop=4

The argument for spaces over tabs says that you have to include some 
metadata in order for the document to look right on other people's 
computers if you use tabs. This example, plus my example mode-line for 
vim, reinforces that idea IMO.
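
A quick Python sketch of the ambiguity, for what it's worth. The 
function name spaces_to_tabs is mine, and the tab width is exactly the 
piece of metadata that has to come from somewhere outside the text:

```python
def spaces_to_tabs(line, tab_width):
    """Replace leading spaces with tabs, assuming tab_width spaces per tab."""
    stripped = line.lstrip(" ")
    n_spaces = len(line) - len(stripped)
    tabs, remainder = divmod(n_spaces, tab_width)
    return "\t" * tabs + " " * remainder + stripped

lines = ["  A", "    B"]

# Same input, two different answers depending on the assumed width.
as_width_2 = [spaces_to_tabs(line, 2) for line in lines]
as_width_4 = [spaces_to_tabs(line, 4) for line in lines]
print(as_width_2)  # ['\tA', '\t\tB']
print(as_width_4)  # ['  A', '\tB']
```

Nothing in the two lines themselves says which result is the intended 
one.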

> 
> The above demonstrates the information loss in using spaces for
> indentation in a theoretical way. There are also practical problems. In
> practice, many languages allow string literals like this myName="i love
> you", and strings easily can have a run of spaces. One cannot simply
> run a blind find-n-replace operation to replace all spaces with tabs. But
> also, many unix languages contain a construct called
> “heredoc” as a means to embed a literal block of text. For example,
> here's a PHP construct of heredoc:
> 
> $novelText = <<<arbitraryCharsHereAsDelimiter
>             (__)
>             (oo)
>      /-------\/
>     / |     ||
>    *  ||----||
>       ~~    ~~
> arbitraryCharsHereAsDelimiter;
> }
> 

Yes, there are lots of situations like this where you can't just 
willy-nilly convert between tabs and spaces. But even this case shows 
that, if you use consistent tab widths, the text has a chance of 
surviving. I converted your little doggie to tabs and back with a tab 
size of eight, and he survived. (I did it with tabs set to four too, 
and it worked.)
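
Here is a rough Python sketch of that round trip, in case anyone wants 
to try it. The helper names (unexpand_leading, round_trip) are made up 
for this post, and I am only converting leading spaces, which is an 
assumption about how such a conversion would typically be done.

```python
# The drawing from the heredoc, one string per line (spaces only).
ART = [
    "            (__)",
    "            (oo)",
    "     /-------\\/",
    "    / |     ||",
    "   *  ||----||",
    "      ~~    ~~",
]

def unexpand_leading(line, tab_width):
    """Turn leading spaces into as many tabs as fit, keeping the remainder as spaces."""
    stripped = line.lstrip(" ")
    n_spaces = len(line) - len(stripped)
    tabs, rest = divmod(n_spaces, tab_width)
    return "\t" * tabs + " " * rest + stripped

def round_trip(lines, tab_width):
    """Convert spaces to tabs and back again, using the same tab width both ways."""
    tabbed = [unexpand_leading(line, tab_width) for line in lines]
    return [line.expandtabs(tab_width) for line in tabbed]

# The drawing comes back unchanged as long as both directions agree on the width.
for width in (4, 8):
    print(width, round_trip(ART, width) == ART)
```

The point is only that the width has to be the same in both directions; 
the sketch says nothing about what happens when it is not.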


> Regardless of its design as a language construct, the purpose of
> “heredoc” is that it allows programmers to easily embed a text (a
> large string), without worrying about the text containing sequences of
> characters that may be meaningful to the language. If a language has
> a heredoc construct, then it is basically impossible to convert from
> spaces to tabs, as that will botch literal strings embedded in heredocs.

Yes it would. Upon printing, if the terminal tab width was set to eight,
but the text conversion was done with tabs at four, bye bye doggie.
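
To make the failure concrete, here is a small Python sketch. The two 
strings are what two lines of the drawing would look like after a 
conversion at width four (my assumption: 12 spaces became three tabs, 5 
spaces became one tab plus one space). indent_of is a helper invented 
for this example.

```python
# Two lines of the drawing after a spaces-to-tabs conversion done at width 4.
head = "\t\t\t(__)"        # originally 12 leading spaces
body = "\t /-------\\/"    # originally 5 leading spaces

def indent_of(line, tab_width):
    """Number of leading columns once tabs are expanded at tab_width."""
    expanded = line.expandtabs(tab_width)
    return len(expanded) - len(expanded.lstrip(" "))

for width in (4, 8):
    print("tab width", width)
    print(head.expandtabs(width))
    print(body.expandtabs(width))
```

At width four the indents are 12 and 5 columns, as drawn. At width 
eight they become 24 and 9, so the head moves 12 columns to the right 
while the body moves only 4, and the picture falls apart.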

> However, it is less of a problem to convert tabs to spaces, because
> spaces appear in literal strings far more often than
> literal tabs.
> 
> Another practical issue is error recovery. Suppose one uses 4 spaces
> for an indentation. Now, it is not uncommon to see lines with an irregular
> number of leading spaces such as 7 or 10 out of common sloppiness. Such
> errors happen more often if spaces are used for indentation; in
> essence, tabs enforce a semantic association and make a
> half-indentation impossible.
> 

What I've learned is that, if I'm going to use tabs for indentation, I 
have to be consistent.
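
For the space-indented case, the sloppy lines Xah describes are at 
least easy to find. A minimal Python sketch, assuming a fixed indent 
width of four and spaces only; odd_indents is a name I made up:

```python
def odd_indents(source, indent_width=4):
    """Return (line_number, prefix_length) for each non-blank line whose
    leading-space count is not a multiple of indent_width."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        prefix = len(line) - len(line.lstrip(" "))
        if prefix % indent_width:
            problems.append((number, prefix))
    return problems

sample = "\n".join([
    "def f():",
    " " * 4 + "x = 1",
    " " * 7 + "y = 2",    # 7 spaces: not a multiple of 4
    " " * 10 + "z = 3",   # 10 spaces: not a multiple of 4
])
print(odd_indents(sample))
```

This only reports the lines; deciding whether 7 spaces was meant to be 
4 or 8 still takes a human.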

> Q: Well, i just like spaces because they are most compatible.
> 
> A: Sure, crass simplicity is always more compatible. Suppose a unixer
> says he doesn't like HTML because it is fraught with problems and
> incompatibilities. He'd rather have plain text. And, indeed, a lot of
> unixers seriously think that.
> 
> ---------------------------
> PS in the answer to the first question, i gave the following examples
> of IDE/Language that actually embed formatting info in the source code:
> Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
> Wolfram Research's Mathematica
> 

Perl's POD and Java's javadoc do it too.

> actually, i know Mathematica does, but i'm not quite sure about the
> other examples. So, my question is, does anyone know a language or
> IDE that actually allows the coder to manually highlight parts of the
> code and have this highlight stick with the file upon reopening, as in a
> word processor?
> 
>    Xah
>    xah at xahlee.org
> http://xahlee.org/
> 
> Xah Lee wrote:
>> Tabs versus Spaces in Source Code
>> This post is archived at:
>> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
> 

I'm slowly moving into the "spaces" camp. After reading your earlier 
post on tabs vs. spaces and other people's responses, I began thinking 
about why I like tabs so much, and there is only one answer--backspace.

If I use tabs, when I backspace I go back to the previous tab position, 
which is what I want. With spaces, I have to hit the backspace key 
several times to get back. That's it--one feature is the only reason I 
like tabs, so I decided to investigate vim's features to see if vim 
would let me backspace to the previous tab position with one keystroke.

'Softtabstop' (sts) is the feature. I would have never thought to look 
for this feature without your post. Thanks again Xah.
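
For anyone curious what that feels like, here is a simplified Python 
model of the backspace behavior. It is my own approximation and not 
Vim's actual implementation: it assumes the line contains only spaces 
(no tabs), and the function name and softtabstop parameter are just for 
illustration.

```python
def backspace(before_cursor, softtabstop=4):
    """Return the text left of the cursor after one backspace.

    If the text ends in spaces, delete back to the previous multiple of
    softtabstop (but never past the run of spaces); otherwise delete a
    single character.
    """
    if not before_cursor:
        return before_cursor
    col = len(before_cursor)
    trailing_spaces = col - len(before_cursor.rstrip(" "))
    if trailing_spaces == 0:
        return before_cursor[:-1]
    to_stop = col % softtabstop or softtabstop
    return before_cursor[: col - min(to_stop, trailing_spaces)]

print(len(backspace(" " * 8)))  # 4
print(len(backspace(" " * 6)))  # 4
print(backspace("ab"))          # a
```

So one keystroke takes an 8-space indent back to 4, which is the one 
thing I wanted from tabs.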

Your posts are on topic, informative, engaging and necessary. Keep them 
coming Xah. :)


