Linguistically correct Python text rendering

David Opstad opstad at batnet.com
Wed Feb 25 10:01:52 EST 2004


In article <m3u11ffk1f.fsf at pc150.maths.bris.ac.uk>,
 Michael Hudson <mwh at python.net> wrote:

> But it seems to be impossible to programmatically determine which
> encoding the terminal being printed to at a given moment is using (and
> the user can fiddle this at run time).  If I'm wrong about this, I'd
> like to know.

The encoding issue is peripheral to my point; sorry if I wasn't clear 
in my original message. It doesn't matter what the encoding is. The main 
issue is that for some writing systems (e.g. Arabic), simply outputting 
the characters of a Unicode string in their stored order, irrespective 
of encoding, will produce garbled results.
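
To make that concrete, here is a rough sketch of how one might detect 
strings that fall into this category, using only the standard 
unicodedata module. The function name needs_layout is made up for 
illustration, and checking just the "R" and "AL" bidirectional classes 
is a simplification of what a real layout engine looks at.

```python
import unicodedata

def needs_layout(text):
    """Return True if text contains strong right-to-left characters.

    Hypothetical helper: it only checks the Unicode bidirectional class
    of each character ("R" = right-to-left, "AL" = Arabic letter).
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)

print(needs_layout("abc"))           # Latin only
print(needs_layout("\u0628\u0627"))  # Arabic BEH + ALEF
```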

> What more would you have us do?

Well, for those writing systems whose presentation forms are included in 
Unicode, how about a further processing step? At a minimum, if I start 
with an Arabic string like "abc", I should be able to get out an Arabic 
string like "CBA", in which bidi reordering has happened and contextual 
substitution has been done. Outputting the processed Unicode string to 
stdout would then work without further intervention (assuming a font for 
the writing system is present, of course).
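
A minimal sketch of the kind of step I mean, under heavy assumptions: it 
handles only a pure right-to-left run of basic Arabic letters, ignores 
combining marks and lam-alef ligatures, and replaces the real Unicode 
Bidirectional Algorithm with a plain reversal. The names 
build_form_table, shape and render_rtl are invented for illustration; 
the only library calls are to the standard unicodedata module, whose 
decomposition data for the Arabic Presentation Forms-B block is used to 
find each letter's isolated, initial, medial and final forms.

```python
import unicodedata

FORM_TAGS = ("isolated", "final", "initial", "medial")

def build_form_table():
    """Map base letter -> {form name: presentation-form character}."""
    table = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter entries such as "<initial> 0628".
        if len(parts) == 2 and parts[0].strip("<>") in FORM_TAGS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[parts[0].strip("<>")] = chr(cp)
    return table

FORMS = build_form_table()

def shape(text):
    """Substitute contextual forms; the result stays in logical order."""
    out = []
    for i, ch in enumerate(text):
        if ch not in FORMS:
            out.append(ch)  # not a letter we know how to shape
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Assumption: a letter joins to the following one only if it has
        # an initial form, and to the preceding one only if it has a
        # final form.
        joins_prev = "initial" in FORMS.get(prev, {}) and "final" in FORMS[ch]
        joins_next = "initial" in FORMS[ch] and "final" in FORMS.get(nxt, {})
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        out.append(FORMS[ch].get(form, ch))
    return "".join(out)

def render_rtl(text):
    """Shape, then reverse into visual order (pure RTL runs only)."""
    return shape(text)[::-1]

visual = render_rtl("\u0628\u0628\u0628")  # three BEH letters
print([unicodedata.name(c) for c in visual])
```

NFKC normalization maps the presentation forms back to the base 
letters, so the shaping step (though not the reversal) can be undone 
with unicodedata.normalize("NFKC", ...).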

It's probably irrational of me, I admit, but I'd love to see Python 
correctly render *any* Unicode string, not just the subsets requiring no 
reordering or contextual processing.

Dave


