Unicode failure

Oscar Benjamin oscar.j.benjamin at gmail.com
Sat Dec 5 05:46:20 EST 2015


On 5 Dec 2015 06:10, "D'Arcy J.M. Cain" <darcy at vybenetworks.com> wrote:
>
> On Fri, 4 Dec 2015 18:28:22 -0500
> Terry Reedy <tjreedy at udel.edu> wrote:
> > On 12/4/2015 1:07 PM, D'Arcy J.M. Cain wrote:
> > > I thought that going to Python 3.4 would solve my Unicode issues
> >
> > Within Python itself, that should be mostly true.  As soon as you
> > send text to a display, the rules of the display device take over.
>
> OK but my display (xterm) can display those characters.  I see it when
> I dump unicode text from my database.
>
> > > #! /usr/bin/python3
> > > # -*- coding: UTF-8 -*-
> >
> > Redundant, as this is the default for 3.x
>
> I assumed so but belt and suspenders, right?
>
> > Tk widgets, and hence IDLE windows, will print any character from
> > \u0000 to \uffff without raising, even if the result is blank or �.
> > Higher codepoints fail, but allowing the entire BMP is better than
> > any Windows codepage.
>
> Not sure I follow all this but to be clear, I am not using Tk, Idle or
> Windows.  I guess I should have mentioned that I am on Unix but I
> thought that the hash-bang would have given that away.  To be complete,
> I am running xterms on Xubuntu connected to NetBSD 7.0.  The data is
> coming from a PostgreSQL 9.3.5 database.  I am using a beta of PyGreSQL
> 5.0 (I am the lead developer for it) and I checked and the type
> returned is str, not bytes.  The database encoding is UTF8.

Yeah, but the error you showed was from print trying to encode the string
as ASCII. For some reason Python seems to think that stdout is ASCII. So I
repeat: what is sys.stdout.encoding? If you're using xterm I think it will
be derived from LANG. So what's LANG?
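
A quick way to check both at once is something like the following. This is
only a minimal sketch, assuming Python 3; the helper name
describe_output_encoding is made up for illustration, and I've included
LC_ALL, LC_CTYPE and PYTHONIOENCODING as well, since they can also
influence the encoding Python picks.

```python
import locale
import os
import sys


def describe_output_encoding(stream=sys.stdout, environ=os.environ):
    """Collect the settings that influence how print() encodes text.

    Hypothetical helper for illustration; it only reads values.
    """
    return {
        # The encoding print() will actually use for this stream.
        "stdout.encoding": getattr(stream, "encoding", None),
        # What the locale machinery reports as the preferred encoding.
        "preferred": locale.getpreferredencoding(False),
        # Environment variables that can affect the choice.
        "LC_ALL": environ.get("LC_ALL"),
        "LC_CTYPE": environ.get("LC_CTYPE"),
        "LANG": environ.get("LANG"),
        "PYTHONIOENCODING": environ.get("PYTHONIOENCODING"),
    }


if __name__ == "__main__":
    for name, value in describe_output_encoding().items():
        print("{}: {!r}".format(name, value))
```

Run it in the same xterm (and as the same user) where the failing script
runs. If stdout.encoding comes back as an ASCII name while LANG is unset or
set to C, that would be consistent with the error you showed.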

--
Oscar


