Right solution to unicode error?
wxjmfauth at gmail.com
Fri Nov 9 05:06:05 EST 2012
On Thursday, November 8, 2012 at 21:42:58 UTC+1, Ian wrote:
> On Thu, Nov 8, 2012 at 12:54 PM, <wxjmfauth at gmail.com> wrote:
> > Font has nothing to do here.
> > You are "simply" wrongly encoding your "unicode".
> >
> > >>> '\u2013'
> > '–'
> > >>> '\u2013'.encode('utf-8')
> > b'\xe2\x80\x93'
> > >>> '\u2013'.encode('utf-8').decode('cp1252')
> > '–'
>
> No, it seriously is the font. This is what I get using the default
> ("Raster") font:
>
> C:\>chcp 65001
> Active code page: 65001
>
> C:\>c:\python33\python
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> '\u2013'
> '–'
> >>> import sys
> >>> sys.stdout.buffer.write('\u2013\n'.encode('utf-8'))
> –
> 4
>
> I should note here that the characters copied and pasted do not
> correspond to the glyphs actually displayed in my terminal window. In
> the terminal window I actually see:
>
> ΓÇô
>
> If I change the font to Lucida Console and run the *exact same code*,
> I get this:
>
> C:\>chcp 65001
> Active code page: 65001
>
> C:\>c:\python33\python
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> '\u2013'
> '–'
>
> >>> import sys
> >>> sys.stdout.buffer.write('\u2013\n'.encode('utf-8'))
> –
> 4
>
> Why is the font important? I have no idea. Blame Microsoft.
---------
If you see something like 'ΓÇô', look at what those characters
are called in Unicode:
>>> import unicodedata as ud
>>> for c in 'ΓÇô':
...     ud.name(c)
...
'GREEK CAPITAL LETTER GAMMA'
'LATIN CAPITAL LETTER C WITH CEDILLA'
'LATIN SMALL LETTER O WITH CIRCUMFLEX'
This is a sign that cp437 is involved somewhere: the UTF-8 bytes
of the en dash are being decoded as cp437.
>>> '\u2013'.encode('utf-8').decode('cp437')
'ΓÇô'
This is on Windows 7. I do not remember ever having a character
encoding issue on XP.
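
The same check can be automated when the guilty code page is not obvious
at first sight. What follows is only a rough sketch: the function names
find_wrong_codec and repair, and the short list of candidate codecs, are
my own choices for illustration and not anything standard. It assumes the
text was correctly encoded as UTF-8 and then decoded with a single wrong
codec.

```python
def find_wrong_codec(garbled, intended,
                     candidates=("cp437", "cp850", "cp1252", "latin-1")):
    """Return the candidate codecs that turn the UTF-8 bytes of
    `intended` into the `garbled` string (hypothetical helper)."""
    raw = intended.encode("utf-8")
    matches = []
    for codec in candidates:
        try:
            if raw.decode(codec) == garbled:
                matches.append(codec)
        except UnicodeDecodeError:
            # Some code pages leave a few byte values undefined.
            continue
    return matches


def repair(garbled, wrong_codec):
    """Undo the wrong decoding: recover the bytes, then decode as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


if __name__ == "__main__":
    # Gamma, C with cedilla, o with circumflex, as seen in the console.
    seen = "\u0393\u00c7\u00f4"
    print(find_wrong_codec(seen, "\u2013"))
    print(ascii(repair(seen, "cp437")))
```

The repair step only works when the wrong codec can map every garbled
character back to a byte, which is the case for cp437 here.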
jmf