[ python-Bugs-1743795 ] Some incorrect national characters (Polish) in unicodedata

SourceForge.net noreply at sourceforge.net
Wed Jun 27 19:25:15 CEST 2007


Bugs item #1743795, was opened at 2007-06-26 18:45
Message generated for change (Comment added) made by admindomeny
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1743795&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: admindomeny (admindomeny)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Some incorrect national characters (Polish) in unicodedata

Initial Comment:
Hello,

This problem regards pythonwin (I haven't checked whether unix/commandline python is affected), Python 2.5.1.

Examples on attached screenshot.

E.g. print u'\N{LATIN SMALL LETTER A WITH CIRCUMFLEX}' prints the wrong character (a Latin small a with some caret above it, it seems), and

print unicodedata.name( / latin small letter a with circumflex, typed in Windows using Polish "programmer's keyboard" / ) produces 'SUPERSCRIPT ONE', which is obviously incorrect.
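
For reference, a minimal sketch (not part of the original report) of how the two names involved can be checked against unicodedata directly, without any keyboard input. Note that the letter typed with AltGr+a on the Polish keyboard is named LATIN SMALL LETTER A WITH OGONEK in the follow-up comment, not A WITH CIRCUMFLEX. The variable names are illustrative only.

```python
import unicodedata

# Resolve both characters by their Unicode names, bypassing the keyboard.
a_circumflex = unicodedata.lookup("LATIN SMALL LETTER A WITH CIRCUMFLEX")
a_ogonek = unicodedata.lookup("LATIN SMALL LETTER A WITH OGONEK")

# Map each character back to its name; output is ASCII-only on purpose,
# so it does not depend on the console encoding.
for ch in (a_circumflex, a_ogonek):
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))
```

If the names round-trip like this, the unicodedata tables are consistent and the fault is more likely in how the character reaches the interpreter.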



----------------------------------------------------------------------

>Comment By: admindomeny (admindomeny)
Date: 2007-06-27 17:25

Message:
Logged In: YES 
user_id=1829093
Originator: YES

You were correct: the attached test file for Polish national characters
shows correct character encodings when run in Pythonwin, after being
edited correctly as Unicode text with Polish characters.

The problem of entering characters in Pythonwin remains, however (OS: Win
XP SP2, Polish edition): I have tried changing fonts to what are Unicode
fonts as far as I know (Times New Roman, Arial, etc), including CE fonts as
well. It doesn't work.

I made sure that Polish Programmer's Keyboard is turned on which gives me
correct encoding in almost all Windows applications, including Unicode
editors like UniRed. Still, Pythonwin shell in particular thinks that
AltGr+a (standard way of entering 'LATIN SMALL LETTER A WITH OGONEK') is
actually 'SUPERSCRIPT ONE' for example.
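
One possible explanation, offered only as an assumption and not confirmed anywhere in this thread: in the Windows Central European code page (cp1250), 'LATIN SMALL LETTER A WITH OGONEK' is the single byte 0xB9, and the same byte read as Latin-1 is 'SUPERSCRIPT ONE'. A minimal sketch of that mismatch:

```python
import unicodedata

# ASSUMPTION: the shell receives the keystroke as a cp1250 byte and
# interprets it as Latin-1. This is a guess, not a verified diagnosis.
a_ogonek = u"\u0105"                 # LATIN SMALL LETTER A WITH OGONEK
raw = a_ogonek.encode("cp1250")      # the byte a cp1250 application would see
misread = raw.decode("latin-1")      # the same byte read with the wrong codec

print("U+%04X %s" % (ord(misread), unicodedata.name(misread)))
```

If this is what happens, the shell's input handling is the culprit rather than the font or the unicodedata module.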

So, to summarize:

1. IDLE edits the text in Unicode correctly provided there's a #-*-
coding: utf-8 -*- header in the first line.

2. Pythonwin executes that file correctly.

3. Pythonwin enters national characters INCORRECTLY (at least as far as
Polish is concerned, but I suspect it's also the case with other
languages).


File Added: test.py

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2007-06-27 08:28

Message:
Logged In: YES 
user_id=38388
Originator: NO

This sounds more like a problem with entry of Unicode characters in
pythonwin than the unicodedata module.

Please create a test.py file with the character using e.g. UTF-8 as source
code encoding and run that through the Python interpreter directly to see
if the problem persists.
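
A minimal sketch of what such a test.py could look like. This is a hypothetical file, not the one attached to this report. It assumes the file is saved as UTF-8 and that the letters are typed in literally rather than written as escapes:

```python
# -*- coding: utf-8 -*-
import unicodedata

# Polish letters typed directly into the UTF-8 encoded source file.
polish = u"ąćęłńóśźż"

# Look up the Unicode name of each character as the interpreter sees it.
names = [unicodedata.name(ch) for ch in polish]

# ASCII-only output, so the result does not depend on the console encoding.
for ch, name in zip(polish, names):
    print("U+%04X %s" % (ord(ch), name))
```

Running the file through the interpreter directly (for example `python test.py`) takes the Pythonwin shell's keyboard handling out of the picture.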


----------------------------------------------------------------------


