[Doc-SIG] Non-breaking spaces

Tony J Ibbs (Tibs) tony@lsl.co.uk
Wed, 8 Aug 2001 09:55:25 +0100


David Goodger wrote:
> Now the question is how to represent it internally. I suppose we could
> store all strings as Unicode internally, and use the Unicode
> non-breaking space character. I haven't gotten into the Unicode
> encodings, so I'm inclined to put this one off (or let someone else
> implement it -- someone who cares ;-).

I think that the following would be useful:

1. Put a note on the description of how character escaping
   (i.e., use of "\\") works, that *at some time in the
   future* an escaped space may be a non-breaking space.

2. *Perhaps* consider noting that this *might* also be
   true for an escaped newline (although I'm much more
   doubtful of this).

There's no real reason for people to escape such things unless they
*are* doing something odd, so this should be safe enough.

Then, sometime down the line, I might try to implement it.
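
A minimal sketch of what point 1 might look like, not a definitive implementation. The function name `unescape` and the rule that every other escaped character is kept literally are assumptions made for illustration; they are not taken from any existing parser.

```python
NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def unescape(text: str) -> str:
    """Resolve backslash escapes; an escaped space becomes a no-break space.

    Hypothetical helper: the handling of escapes other than "\\ " is an
    assumption (the escaped character is simply kept as-is).
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # Escaped space -> NBSP; any other escaped character is literal.
            out.append(NBSP if nxt == " " else nxt)
            i += 2
        else:
            # Ordinary character (or a lone trailing backslash).
            out.append(ch)
            i += 1
    return "".join(out)
```

With this, `unescape("10\\ km")` would keep "10" and "km" on the same line, while unescaped text passes through unchanged.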

As for representation - collapse each run of adjacent non-breaking
spaces into one node in the DOM tree, and make it text with the
"retain spaces as spaces" attribute set (damn, can't remember its name,
but it's a standard thing to do - cf. how HTML handles literals).
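
One possible reading of this, sketched with the standard library's `xml.dom.minidom`. The attribute being half-remembered is probably XML's `xml:space="preserve"`; that identification is an assumption, as are the element names `paragraph` and `nbsp_run` and the function name `build_paragraph`.

```python
import re
from xml.dom.minidom import Document

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def build_paragraph(text: str) -> Document:
    """Build a tiny DOM where each run of NBSPs becomes a single element."""
    doc = Document()
    para = doc.createElement("paragraph")  # hypothetical element name
    doc.appendChild(para)
    # Split on runs of NBSP; the capture group keeps the runs in the result.
    for piece in re.split(f"({NBSP}+)", text):
        if not piece:
            continue  # re.split yields "" when the text starts/ends with a run
        if piece[0] == NBSP:
            node = doc.createElement("nbsp_run")  # hypothetical element name
            # Assumed to be the "retain spaces as spaces" attribute.
            node.setAttribute("xml:space", "preserve")
            # Store ordinary spaces; the attribute says they must be kept.
            node.appendChild(doc.createTextNode(" " * len(piece)))
            para.appendChild(node)
        else:
            para.appendChild(doc.createTextNode(piece))
    return doc
```

Calling `build_paragraph("a\u00a0\u00a0b").documentElement.toxml()` serialises the tree so the single `nbsp_run` element and its attribute can be inspected.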

(Inside your *own* code, I'd be tempted to have a node called
"non_breaking_spaces" that just stores an integer count - at least
as a first hack!)
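
The "node with an integer" idea could be sketched roughly as follows. The class name `NonBreakingSpaces`, its `astext` method, and the helper `to_nodes` are all hypothetical, chosen only to illustrate the first-hack representation.

```python
from dataclasses import dataclass
from itertools import groupby

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


@dataclass
class NonBreakingSpaces:
    """Hypothetical node: stores only how many NBSPs the run contained."""
    count: int

    def astext(self) -> str:
        return NBSP * self.count


def to_nodes(text: str) -> list:
    """Split text into plain strings and NonBreakingSpaces nodes."""
    nodes = []
    # groupby yields maximal runs of characters sharing the same key.
    for is_nbsp, chars in groupby(text, key=lambda c: c == NBSP):
        run = "".join(chars)
        nodes.append(NonBreakingSpaces(len(run)) if is_nbsp else run)
    return nodes
```

Storing only the count keeps the node trivial to compare and print, and the original text can still be recovered by joining the strings with each node's `astext()`.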

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)