[I18n-sig] Literal strings

Paul Prescod paul@prescod.net
Thu, 01 Jun 2000 22:20:48 -0500


I am thinking about string literals. Not narrow strings in general, just
string literals in particular. I'm not sure where we left the issue of a
statement about the "encoding" of string literals. Here's my input.

I have a lot of code like this:

if tagName=="foo":
	...

I would like it to magically work with Unicode. Guido's proposal allows
it to magically work with Unicode-encoded ASCII, but not with the full
range of Unicode characters. I'm not entirely happy that my code will
crash and burn the first time someone pops in a cedilla.
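
To make the failure concrete, here is a minimal sketch written for a
current Python 3 interpreter. It models the implicit narrow-to-Unicode
coercion explicitly as an ASCII decode. The helper names
coerce_to_unicode and tag_matches are hypothetical, and the ASCII-only
rule is my reading of the proposal, not a quotation of it.

```python
def coerce_to_unicode(narrow):
    """Model the implicit coercion of a narrow string: ASCII only (assumed)."""
    return narrow.decode("ascii")


def tag_matches(tag_name, literal):
    """Compare a Unicode tag name against a narrow (byte) string literal."""
    return tag_name == coerce_to_unicode(literal)


# Pure ASCII on both sides: the comparison works.
ascii_ok = tag_matches("foo", b"foo")

# One c-cedilla (byte 0xE7 in Latin-1) and the coercion blows up.
try:
    tag_matches("fa\u00e7ade", b"fa\xe7ade")
    cedilla_failed = False
except UnicodeDecodeError:
    cedilla_failed = True
```

The point is that the error comes from the coercion step, not from the
comparison itself, so it only appears once non-ASCII data arrives.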

What would be the consequences of a module-level pragma that allows the
literal strings in my module to be interpreted as *Unicode literals*
instead of ASCII literals? I usually know that all of the literals in my
program are raw ASCII, so even if they are interpreted as Unicode, they
will be "compatible with" raw ASCII input. The only thing that they
would not be compatible with is 8-bit binary goo, which they were never
intended to be compatible with anyhow.

I just want to add something at the top of my file like:

#pragma IL8N

and have my literal strings act as Unicode.
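
As a rough illustration of the behaviour I am after, the sketch below
assumes an interpreter in which plain literals are already Unicode
strings (this is how Python 3 treats them). The function name is_foo is
hypothetical; nothing here is an implementation of the pragma itself.

```python
# Assumption: a plain literal is a Unicode string, as in Python 3.
LITERAL = "foo"


def is_foo(tag_name):
    """Compare an incoming tag name against a Unicode literal."""
    return tag_name == LITERAL


# ASCII input decoded to Unicode still matches the literal.
ascii_match = is_foo(b"foo".decode("ascii"))

# A name containing a cedilla simply fails to match; nothing is raised.
cedilla_match = is_foo("fo\u00e7")

# Undecoded 8-bit data is a different type and never compares equal.
binary_match = is_foo(b"foo")
```

Here the literals stay compatible with ASCII input, tolerate the full
Unicode range, and are only incompatible with raw binary data.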

Now I could go through my code and change all of the literals to Unicode
literals by hand, but:

 a) that's really ugly, syntactically

 b) I feel like I'll end up switching them all back when we just make
literal strings "wide" by default

 c) I feel like I'm being penalized for making my program
internationalized

 d) I have a lot of code, as we all do.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Simplicity does not precede complexity, but follows it. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html