Translating Javascript programs to python.

Terry Reedy tjreedy at udel.edu
Wed Aug 23 22:01:38 EDT 2006


"Vyz" <vyzasatya at gmail.com> wrote in message 
news:1156371324.186510.273930 at 75g2000cwc.googlegroups.com...
> It's a module to transliterate from the Telugu language written in roman
> script into native Unicode. Right now it's running in a browser window
> at www.lekhini.org. I intend to translate it into Python so that I can
> integrate it into other tools I have. Being able to pass arguments and get
> output from the script would also be OK. Or how about ways to wrap
> these JavaScript functions with Python?

Leaving aside the code that manipulates the display and user interaction, 
the code should be pretty straightforward logic (if-else statements) and 
table lookups, so translation to Python should be straightforward also.

I checked parser.js.  I don't know javascript but it looks to me like a 
mixture of C and Python.  The for loop headers have to be rewritten, and 
the switch changed to if-elif.  What looks different is that method 
functions are attached as attributes to functions rather than to classes.
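
A rough sketch of those three rewrites, in present-day Python, with the 
javascript shown only in comments.  All names here (Parser, next_char, 
classify) are made up for illustration and are not taken from parser.js.

```python
# Javascript shapes being translated (illustrative only, not from parser.js):
#   function Parser(text) { this.text = text; this.pos = 0; }
#   Parser.prototype.nextChar = function () { ... };
#   for (var i = 0; i < s.length; i++) { ... }
#   switch (ch) { case "a": ...; break; default: ... }


class Parser:
    """Methods attached to a js function become methods of a Python class."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def next_char(self):
        # Return the current character and advance the position.
        ch = self.text[self.pos]
        self.pos += 1
        return ch


def classify(ch):
    # A js switch/case becomes an if-elif chain.
    if ch in "aeiou":
        return "vowel"
    elif ch.isalpha():
        return "consonant"
    else:
        return "other"


def classify_all(s):
    # "for (i = 0; i < s.length; i++)" becomes direct iteration over s.
    return [classify(ch) for ch in s]
```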

As for 'wrapping': can you get a standard javascript interpreter?  If so, 
you could possibly adjust the js so you can pipe a roman string to the js 
program and have it pipe back the Telugu Unicode version.
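
A minimal sketch of that piping from the Python side, using the standard 
subprocess module.  The command is passed in as a parameter because I do 
not know which interpreter you have; something like 
["node", "transliterate.js"] is only an assumed example, and the js side 
would have to be adjusted to read stdin and write stdout.

```python
import subprocess


def transliterate_via_js(roman_text, command):
    """Pipe roman_text to an external program and return what it prints.

    `command` is the argv list for whatever js interpreter is available,
    e.g. ["node", "transliterate.js"] (both names are assumptions).
    """
    result = subprocess.run(
        command,
        input=roman_text.encode("utf-8"),  # sent to the program's stdin
        stdout=subprocess.PIPE,            # capture the program's stdout
        check=True,                        # raise if the program fails
    )
    return result.stdout.decode("utf-8")
```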

>> > I have a script with hundreds of lines of javascript spread across 7
>> > files. Is there any tool out there to automatically or
>> > semi-automatically translate the code into python?

unicode.js is mostly a few hundred verbose lines like

Unicode.codePoints[Padma.lang_TELUGU].letter_PHA  = "\u0C2B";

that set up the translation dict.  Because the object model is different, I 
suspect that these all need to be changed, but, I also suspect, in a 
mechanical way.
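
One way such a mechanical conversion might look, assuming every line has 
exactly the shape of the one quoted above.  The function name js_to_table 
and the nested-dict layout are my own choices, not anything from Padma.

```python
import re

# Matches lines shaped like the one quoted from unicode.js:
#   Unicode.codePoints[Padma.lang_TELUGU].letter_PHA  = "\u0C2B";
LINE_RE = re.compile(
    r'Unicode\.codePoints\[Padma\.lang_(\w+)\]\.(\w+)\s*=\s*'
    r'"\\u([0-9A-Fa-f]{4})"\s*;'
)


def js_to_table(js_source):
    """Return {language: {name: character}} for every matching assignment."""
    table = {}
    for lang, name, hexcode in LINE_RE.findall(js_source):
        # Convert the four hex digits of the js escape into a character.
        table.setdefault(lang, {})[name] = chr(int(hexcode, 16))
    return table
```

Lines that do not match this one pattern are silently skipped, so the 
result would need checking against the original file.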

If one were starting in Python, one might either just define a dict more 
compactly like
  TEL_uni = {'letter_PHA': u"\u0C2B", ...}
*or* probably better, use the standard unicodedata module as much as 
possible.

>>> import unicodedata as u
>>> pha = u.name(u'\u0c2b')
>>> pha
'TELUGU LETTER PHA'
>>> u.lookup(pha)
u'\u0c2b'
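
Building on that, a small sketch (present-day Python 3) that fills a table 
from Unicode character names instead of hand-typed escapes.  The helper 
name telugu_letters and the three sample letters are only for illustration.

```python
import unicodedata


def telugu_letters(names):
    """Map short names like "PHA" to characters via the Unicode database."""
    return {n: unicodedata.lookup("TELUGU LETTER " + n) for n in names}


# Sample table; a real one would list every letter the transliterator needs.
TEL_uni = telugu_letters(["KA", "KHA", "PHA"])
```

unicodedata.lookup raises KeyError for a name that is not in the database, 
so a misspelled letter name fails loudly rather than producing a wrong 
character.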

I don't know what to do with js statements like this:
Unicode.toPadma[Unicode.codePoints[Padma.lang_TELUGU].misc_VIRAMA + 
Unicode.codePoints[Padma.lang_TELUGU].letter_KA] = Padma.vattu_KA;
where a constant seems to be assigned to a sum.  But whatever these do 
might correspond to the unicodedata.normalize function.
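
One possible reading, offered as a guess: if the code point values are 
strings, then + in javascript concatenates them, so the statement stores 
Padma.vattu_KA under the two-character key virama + KA.  Under that 
assumption a Python dict keyed by the same string would do.  VATTU_KA 
below is a placeholder, since the real value of Padma.vattu_KA is not 
shown here, and lookup_pair is a made-up helper.

```python
import unicodedata

VIRAMA = unicodedata.lookup("TELUGU SIGN VIRAMA")
KA = unicodedata.lookup("TELUGU LETTER KA")

VATTU_KA = "<vattu_KA>"  # placeholder: actual Padma.vattu_KA value unknown

# Key is the two-character sequence virama + KA (string concatenation).
to_padma = {VIRAMA + KA: VATTU_KA}


def lookup_pair(text, i):
    """Return the mapped value for the two characters at text[i:i+2], or None."""
    return to_padma.get(text[i:i + 2])
```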

This appears to be based on a generic Indian-script transliteration program 
(Padma), so there may be functions not really needed for Telugu.  (I am 
familiar with Devanagari but know nothing of Telugu and its script except 
that it is Dravidian rather than Indo-European-Sanskritic.)

Good luck.

Terry Jan Reedy






