Unicode usenet posting. This is a test.

Xah Lee xahlee at gmail.com
Sun Sep 26 18:20:33 EDT 2010


On Sep 26, 5:40 am, Spiros Bousbouras <spi... at gmail.com> wrote:
> > And just for good measure, some «European style quotes» and “balanced smart
> > quotes” which I intend some day to try to convince people to start using
> > to eliminate the scourge of backslash escapes.  But that's a topic for
> > another day.
>
> I don't see how they would help to eliminate backslash escapes. Let's
> imagine that strings were delimited by « and ». If you wanted a
> string which contained a » you would still need to escape it.

I thought about this, but ultimately I don't think escaping is avoidable.

However, using matching pairs eliminates many unnecessary escapes.
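
To make that concrete, here is a minimal Python sketch of a reader for
«...» strings that counts nesting depth. The function name read_quoted and
the depth-counting rule are my own illustration, not any real language's
lexer. Balanced inner quotes need no escape; only a lone » cuts the
string short, which is the case Spiros describes.

```python
def read_quoted(text, start=0, open_q="«", close_q="»"):
    """Return (content, end_index) for the balanced string opening at text[start]."""
    if text[start] != open_q:
        raise ValueError("expected opening quote")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == open_q:
            depth += 1          # a nested opening quote
        elif ch == close_q:
            depth -= 1          # closes the innermost open quote
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced quotes")

# Nested, balanced quotes pass through without any escape.
source = "«he said «hello» to me» rest"
content, end = read_quoted(source)

# A lone closing quote ends the string early, so it would still need an escape.
truncated, _ = read_quoted("«a » b»")
```

With straight quotes the inner "hello" would need two escapes, because the
opening and closing char are the same and depth cannot be counted.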

For example, in my recent essay about HTML6, I thought about the escape
issue.  Ultimately, if your content refers to the language itself,
then you will need escapes, unless your language can switch to another
quoting mechanism.
For example, in Perl:

"this"
'this'
q[this]
q(this)
q{this}

print <<'xyzxyz';
this
xyzxyz


These are all equivalent for this string (technically, except the last
one, which contains a trailing newline).

Note that Perl has variable quoting chars.  Python is similar, e.g.

"this"
'this'
"""this"""
'''this'''

In Python, the variability is less than in Perl.
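
A quick check of the Python forms, as a small sketch (the variable names
are mine). It also shows the practical use of having a second delimiter:
switching quotes removes an escape.

```python
# The four Python spellings denote the same string value.
forms = ["this", 'this', """this""", '''this''']
all_equal = len(set(forms)) == 1

# Switching delimiters avoids an escape that the other form needs.
no_escape = "it's"
with_escape = 'it\'s'
same = no_escape == with_escape
```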

Now, suppose you need to quote Perl code in Perl itself, or Python
in Python. e.g. suppose you are writing a Perl script that parses a
Perl snippet that contains all of Perl's quoting mechanisms. Then,
basically, you'll start to need escapes. (It gets worse when this is
nested. A practical example that happens to me often is writing a blog
in HTML about using Perl to parse HTML, where the HTML contains complex
JavaScript or PHP, which may contain regex strings, and you need to
show the Perl source code in HTML with syntax-highlight markup.)
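
Here is a small Python sketch of that situation (the names pieces and
snippet are just for illustration). Each piece on its own can pick a
delimiter that does not clash with its content, but the combined snippet
uses all four quoting forms, so no single Python literal can hold it
without a backslash. repr() shows the literal that Python itself writes
for it.

```python
import ast

# Each piece avoids escapes by choosing a delimiter its content does not use.
pieces = ["a = 'x'", 'b = "y"', "c = '''z'''", 'd = """w"""']

# The combined snippet contains ', ", ''' and """ all at once.
snippet = "; ".join(pieces)

# Python's own single-literal spelling of the snippet.
literal = repr(snippet)
needs_escape = "\\" in literal

# The escaped literal still evaluates back to the same text.
round_trip = ast.literal_eval(literal) == snippet
```

Building the string from separately quoted pieces, as above, is the
"switch to another quoting mechanism" way out. It works only because each
piece happens to leave one delimiter free.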

In the study of symbolic logic, this is a form of self-reference, and
it is an unavoidable problem (you cannot both avoid escaping chars and
keep the ability to quote the lang itself).

The variable quoting chars also introduce some complexity. Namely,
your lang at the syntax level is no longer simple. e.g. in emacs lisp,
whenever the straight double quote char appears, it has only one
meaning (except in special cases such as in comments or when
escaped).  And, when you need to get the strings in a lang, the only
char you need to look for is the straight double quote.  In langs with
variable quotes such as Perl, this is no longer true, in one or both ways.
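
A rough Python sketch of the difference. The regex stands for a
hypothetical lang whose only string syntax is "..." with backslash
escapes; it ignores comments and other special cases, so it is not a real
emacs lisp reader. For Python source, the same regex gives wrong answers,
and you need the real tokenizer (the standard tokenize module) instead.

```python
import io
import re
import tokenize

# Hypothetical single-delimiter lang: strings are "..." with backslash escapes.
SIMPLE_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')

def simple_strings(source):
    """Find strings by looking for one delimiter only."""
    return SIMPLE_STRING.findall(source)

def python_strings(source):
    """Find string literals in Python source using the standard tokenizer."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.STRING]

lisp_like = '(message "say \\"hi\\"") (concat "a" "b")'
py_source = "x = 'a'\ny = \"b\"\nz = '''c \"d\" e'''\n"

found_simple = simple_strings(lisp_like)   # one rule is enough here
found_wrong = simple_strings(py_source)    # misses 'a', misreads "d"
found_python = python_strings(py_source)   # needs a full tokenizer
```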

Whatever your philosophy in lang design with regard to quoting
mechanisms, Unicode introduces many proper matching pairs that
are helpful, and that avoid multiple semantic meanings for a given char.

In a similar way, this is one of my pet peeves in math notation and
computer lang syntaxes. e.g. in Mathematica, the paren is used for one
single purpose only, always. Namely, grouping. The square bracket []
has one single purpose only, namely as the bracket for function arguments.
The curly brackets {} again have one single purpose only, as syntax
sugar for lists, e.g. List[1,2] is the same as {1,2}. In traditional
math notation and most comp langs, it's all context-dependent soup.

• 〈Strings in Perl and Python〉
http://xahlee.org/perl-python/quoting_strings.html

• 〈Strings in PHP〉
http://xahlee.org/php/quoting_strings.html

• 〈HTML6, Your HTML/XML Simplified〉
http://xahlee.org/comp/html6.html

• 〈Matching Brackets in Unicode〉
http://xahlee.org/comp/unicode_matching_brackets.html

 Xah ∑ xahlee.org ☄


