Linguistically correct Python text rendering

Wed Feb 25 12:52:16 EST 2004

David Opstad <opstad at batnet.com> writes:

> In article <m3u11ffk1f.fsf at pc150.maths.bris.ac.uk>,
>  Michael Hudson <mwh at python.net> wrote:
> 
> > But it seems to be impossible to programmatically determine which
> > encoding the terminal being printed to at a given moment is using (and
> > the user can fiddle this at run time).  If I'm wrong about this, I'd
> > like to know.
> 
> The encoding issue is peripheral to my point; sorry if I wasn't clearer 
> in my original message. It doesn't matter what the encoding is. The main 
> issue is that for some writing systems (e.g. Arabic) simply outputting 
> the characters in a Unicode string, irrespective of encoding, will 
> produce garbled results.
> 
> > What more would you have us do?
> 
> Well, for those writing systems whose presentation forms are included in 
> Unicode, how about a further processing step? So that at a minimum, if I 
> start with an Arabic string like "abc" I can get out an Arabic string 
> like "CBA" where bidi reordering has happened, and contextual 
> substitution has been done. Then, outputting the processed Unicode 
> string using stdout will work without further intervention (assuming a 
> font for the writing system is present, of course).

Ah, OK.  You are now officially beyond my level of expertise :-) You
might want to talk to the i18n-sig.

This sounds very much like the sort of thing that could/should be
developed externally to Python and then perhaps folded in later, a la
CJKCodecs.
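
For the sake of discussion, here is a very rough sketch of what such an
external processing step might start from, using only the standard
unicodedata module. The function names (naive_visual_order,
presentation_forms) are made up for illustration. The reordering is a
deliberate oversimplification and not the Unicode Bidirectional Algorithm:
it only reverses runs of strong right-to-left characters, so spaces,
digits and combining marks are not handled properly. The second function
only builds a lookup table of Arabic presentation forms. It does not
decide which form a letter needs, since that depends on joining
information that unicodedata does not expose.

```python
import unicodedata

# Strong right-to-left bidi classes: "R" (e.g. Hebrew), "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def naive_visual_order(text):
    """Reverse each maximal run of strong right-to-left characters.

    Assumption: this is a toy stand-in for real bidi reordering. Neutral
    characters, numbers and combining marks end a run instead of being
    resolved against their context.
    """
    out, run = [], []
    for ch in text:
        if unicodedata.bidirectional(ch) in RTL_CLASSES:
            run.append(ch)
        else:
            out.extend(reversed(run))  # flush the pending RTL run, reversed
            run = []
            out.append(ch)
    out.extend(reversed(run))
    return "".join(out)


def presentation_forms():
    """Map (base letter, form name) to its Arabic Presentation Forms-B character.

    The table is derived from the compatibility decompositions, which look
    like "<initial> 0628". Ligatures and other multi-character
    decompositions are skipped.
    """
    table = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) != 2 or not parts[0].startswith("<"):
            continue  # unassigned, no decomposition, or more than one base
        form = parts[0].strip("<>")
        if form in ("isolated", "initial", "medial", "final"):
            table[(chr(int(parts[1], 16)), form)] = chr(cp)
    return table
```

A real implementation would need to follow the full bidi algorithm and
pick the form of each letter from its neighbours, which is exactly the
part that wants proper expertise.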

> It's probably irrational of me, I admit, but I'd love to see Python 
> correctly render *any* Unicode string, not just the subsets requiring no 
> reordering or contextual processing.

I still think "render" is probably the wrong word to use here, though.

Cheers,
mwh

-- 
  In short, just business as usual in the wacky world of floating
  point <wink>.                        -- Tim Peters, comp.lang.python