[Python-3000] pep 3131 again

Thu May 17 05:14:21 CEST 2007

On May 16, 2007, at 9:06 PM, tomer filiba wrote:

> === RTL/LTR ===
> i pointed out already that no existing editor can handle LTR-RTL
> representation correctly, which essentially renders all RTL languages
> out of the scope of this PEP. that doesn't bother me personally so much,
> as i'm not going to use this feature anyway, but that still leaves us with
> the "european imposed colonialism" :)
>
> the only practical way to use RTL languages in code is to have an RTL
> programming language, where "if" is spelled "אם", "for" as "עבור",
> "in" as "בתוך", and so on, and the entire program is RTL. having code
> like --

> for קקי in פיפי(1,2,3)

> is only unreadable by all means (since the parenthesis are LTR, while
> the name is RTL, etc.)

It is interesting to contrast how that renders (with A, B, C standing
in for Hebrew characters):
for ABB in 1,2,3)ACAC)

with the rendering of:
for קקי in פיפי(a,b,c)
as:
for ABB in ACAC(a,b,c)

This is, I suppose, because digits and punctuation have only weak or
neutral directionality in the bidi algorithm, which isn't really
appropriate for tokens in a programming language. So yes, clearly, an
editor that takes into account the special needs of programming
languages is necessary to write bidi code effectively. But such an
editor is certainly not inconceivable, and I don't see why the
non-existence of an effective bidi editor should influence the decision
to allow Unicode characters in Python at all. For the majority of
languages, which are LTR, it is not an issue, and I have every
confidence that the bidi programming editor problem will be solved at
some point in the future. The only thing Python can possibly do to help
with this is to ignore any RLO/LRO/LRE/RLE/PDF/RLM/LRM characters it
sees during tokenization. (It probably ought to ignore anything with the
"Default_Ignorable_Code_Point" Unicode property.)

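For what it's worth, the directionality classes behind the two
renderings can be inspected from Python itself. This is only a rough
sketch using the standard unicodedata module; the helper name
bidi_classes is made up for illustration, and the Hebrew letters are
written as escapes so the listing itself doesn't get reordered:

```python
import unicodedata

def bidi_classes(text):
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# The call from the quoted example, in logical (typed) order:
# four Hebrew letters followed by "(1,2,3)".
sample = "\u05e4\u05d9\u05e4\u05d9(1,2,3)"

for ch, cls in bidi_classes(sample):
    # R = strong right-to-left, EN = European number (weak),
    # CS = common separator (weak), ON = other neutral.
    print(repr(ch), cls)
```

The Hebrew letters come out as strong R, while the digits, commas and
parentheses are only weak or neutral. In the (a,b,c) variant the
arguments are strong L letters instead.
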
This would allow a smart editor to save the text with such formatting  
characters in it, so that other "dumb" viewers would not be confused.
For example, with explicit formatting added, rendering can be made  
correct:
for ‪קקי‬ in ‪פיפי‭‌(1,2,3)
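
As a minimal sketch of what "ignoring" those characters could amount to,
here is a filter that drops them before the text reaches a tokenizer.
The function name and the placement of the marks in the sample string
are my own assumptions for illustration, and this is not how Python's
tokenizer behaves today. It only covers the seven characters named
above, not the whole Default_Ignorable_Code_Point set:

```python
# Explicit bidi formatting characters and their code points.
BIDI_CONTROLS = {
    "\u202a",  # LRE, left-to-right embedding
    "\u202b",  # RLE, right-to-left embedding
    "\u202c",  # PDF, pop directional formatting
    "\u202d",  # LRO, left-to-right override
    "\u202e",  # RLO, right-to-left override
    "\u200e",  # LRM, left-to-right mark
    "\u200f",  # RLM, right-to-left mark
}

def strip_bidi_controls(source):
    """Hypothetical pre-tokenization step: remove bidi formatting characters.

    Note that this naive version also strips them inside string
    literals and comments, which a real tokenizer might want to keep.
    """
    return "".join(ch for ch in source if ch not in BIDI_CONTROLS)

# Plain logical-order text, and the same text with embedding marks
# wrapped around each Hebrew identifier (an assumed placement).
plain = "for \u05e7\u05e7\u05d9 in \u05e4\u05d9\u05e4\u05d9(1,2,3)"
marked = ("for \u202a\u05e7\u05e7\u05d9\u202c in "
          "\u202a\u05e4\u05d9\u05e4\u05d9\u202c(1,2,3)")

print(strip_bidi_controls(marked) == plain)
```

With a step like this, the marked-up and plain versions would tokenize
identically, so an editor would be free to add or remove the marks
purely for display purposes.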

http://imagic.weizmann.ac.il/~dov/Hebrew/logicUI24.htm#h1-25 shows  
someone has thought about this at least a little from the editor  
perspective...

James