Allowing non-ASCII identifiers

Wed Feb 11 19:18:16 EST 2004

Dietrich Epp wrote:

> You could require that all identifiers be the canonically decomposed 
> Unicode representations encoded into UTF-8.  This would mean that no 
> matter which string is chosen from the above, the result is always the 
> same sequence of characters.  This is how many filesystems use unicode, 
> i.e., Mac HFS+ works this way (but filesystems usually also require a 
> specific version of Unicode for backwards compatibility).
There are several "normal forms" for Unicode text (NFC, NFD, NFKC, and
NFKD).  You'd need to choose one.
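
As a rough sketch of what choosing one would mean: the snippet below uses
Python's standard unicodedata module to compare two spellings of the same
letter.  The helper name same_identifier is made up for illustration, and
nothing here is an existing or proposed language rule.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def same_identifier(a, b, form="NFD"):
    """Hypothetical check: compare two names after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Length in code points of each spelling under each normal form.
forms = ("NFC", "NFD", "NFKC", "NFKD")
results = {
    form: [len(unicodedata.normalize(form, s)) for s in (composed, decomposed)]
    for form in forms
}
```

The raw strings compare unequal, but under any single form they agree.
The forms differ from each other, though: the composed forms give one
code point and the decomposed forms give two, which is why one form has
to be picked and used everywhere.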

> I personally think that Unicode identifiers would be catastrophic.....
(lotsa examples, some good, some not-so-good, elided)
I'm reluctant to endorse it because I _know_ I'll see "Why doesn't my
program work?" accompanied by characters I'm not used to distinguishing.
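
A small sketch of the kind of report I mean (variable names are only for
illustration): picking a normal form does not help with look-alike
letters from different scripts, because they stay distinct characters in
every form.

```python
import unicodedata

latin_a = "a"           # U+0061
cyrillic_a = "\u0430"   # U+0430, looks nearly identical in most fonts

# The official character names show they are unrelated code points.
names = [unicodedata.name(ch) for ch in (latin_a, cyrillic_a)]

# No normalization form maps one onto the other.
still_distinct = all(
    unicodedata.normalize(form, latin_a) != unicodedata.normalize(form, cyrillic_a)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
)
```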

> I think the assumption some people have is that Unicode will only ever 
> be used for things that are like the roman alphabet: adding diacritical 
> marks, etc.  It sounds like the most worthless extension ever, and the 
> only language I think of when I think of special characters is Intercal.  
And this is why I had to comment.  You obviously never dealt with APL.
I actually used it without an APL type ball, which was painful in the
extreme.  When I give language summaries, my quote for APL is,
"APL is the only language where you regularly see one programmer walk
into another's office (well, cube now, but in the day....) and say,
'I bet you cannot guess what this one-line program does.'"

-- 
-Scott David Daniels
Scott.Daniels at Acm.Org