Python and Jython inconsistencies when encoding strings

Mon Sep 9 03:43:52 EDT 2002

Thanks,

> sticking to ascii there can avoid some troubles :).

Unfortunately, that is not always possible :(

Cheers,
Andre

"Samuele Pedroni" <pedronis at bluewin.ch> wrote in message
news:3d78ce33$1_3 at news.bluewin.ch...
>
> Martin v. Löwis <loewis at informatik.hu-berlin.de> wrote in message
> j4ofbbclfp.fsf at informatik.hu-berlin.de...
> > >>> s
> > u"\u0153"
> >
> > Now, U+0153 is LATIN SMALL LIGATURE OE. It so happens that \x9c (what
> > the terminal sends) is U+0153 in CP 1252 (which is the ANSI code page
> > on your Windows installation). This might be a bug in Java, which
> > assumes that bytes sent by the terminal are in the ANSI code page,
> > when they are really in the OEM code page.
>
> No, it's more the Jython parser that does that. Things can be fixed by
> running Jython as
>
> jython -Dpython.console.encoding=cp850
>
> on the other hand output seems buggy for:
>
> print s.encode("cp850")
>
> [I have reported that on our SF bug tracker]
>
>
> > > Does anybody know what is causing this inconsistency? Is there any way to
> > > avoid it?
> >
> > Yes. Don't use the console.
>
> sticking to ascii there can avoid some troubles :).
>
> regards
>
>
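
For reference, the code page mismatch described in the quoted text can be reproduced with a short sketch. It assumes a modern CPython 3 interpreter (not the 2002-era Python or Jython discussed in the thread), and the variable names are illustrative only. It shows that the single byte \x9c decodes to different characters depending on whether it is treated as CP 1252 (ANSI) or CP 850 (OEM):

```python
# Sketch only: assumes CPython 3; variable names are hypothetical.
# \x9c is the byte the thread says the terminal sends.
raw = b"\x9c"

# Interpreted in the Windows ANSI code page (CP 1252).
as_ansi = raw.decode("cp1252")

# Interpreted in the OEM console code page (CP 850).
as_oem = raw.decode("cp850")

# ascii() escapes non-ASCII characters, so the output does not
# depend on the encoding of the console running this script.
print(ascii(as_ansi))  # '\u0153' (LATIN SMALL LIGATURE OE)
print(ascii(as_oem))   # '\xa3'   (POUND SIGN)
```

Decoding with the code page the console actually uses, which is what the python.console.encoding setting mentioned above selects, is what keeps the two interpretations from diverging.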