[ python-Bugs-1448490 ] Conversion error for latin1 characters with iso-2022-jp-2

SourceForge.net noreply at sourceforge.net
Mon Mar 13 11:27:21 CET 2006


Bugs item #1448490, was opened at 2006-03-13 06:57
Message generated for change (Comment added) made by perky
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1448490&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Francois Duranleau (duranlef)
>Assigned to: Hye-Shik Chang (perky)
Summary: Conversion error for latin1 characters with iso-2022-jp-2

Initial Comment:
Reading a text file encoded in ISO-2022-JP-2 with the
codecs module appears to lose characters. In all my test
cases, every accented Latin-1 character (e.g. e acute) is
missing from the output string. However, if I convert the
file manually using iconv, I get everything right. Here is
a simple script that illustrates the problem:

###########################################

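import codecs

import pygtk
pygtk.require( "2.0" )
import gtk

# Decode the ISO-2022-JP-2 file through the codecs module
# (this is the path that loses the accented characters).
f = codecs.open( "test.iso-2022-jp-2" , "r" , "iso-2022-jp-2" )
s1 = f.readline().strip()
f.close()

# Read the iconv-converted reference file as raw UTF-8 bytes,
# which gtk.Label accepts directly.
f = open( "test.utf-8" , "r" )
s2 = f.readline().strip()
f.close()

# Show both strings one above the other for comparison.
pack = gtk.VBox()
pack.pack_start( gtk.Label( s1 ) )
pack.pack_start( gtk.Label( s2 ) )

window = gtk.Window( gtk.WINDOW_TOPLEVEL )
window.add( pack )
window.show_all()

def event_destroy( widget , data ) :
    # "destroy" handlers receive only the widget and the user data.
    gtk.main_quit()

# Returning False from "delete_event" lets the window be destroyed.
window.connect( "delete_event" , lambda w , e , d : False , None )
window.connect( "destroy" , event_destroy , None )

gtk.main()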

###########################################

I have attached the file "test.iso-2022-jp-2". To
create the UTF-8 version of the file, I used the
following shell command:

iconv -f ISO-2022-JP-2 -t UTF-8 \
    test.iso-2022-jp-2 > test.utf-8
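
For comparison, the same conversion can be attempted from Python itself.
The following is only a minimal sketch: the helper name convert_file and
its parameters are hypothetical and not part of the original report, and
the output matches iconv only on a Python build whose iso-2022-jp-2 codec
decodes Latin-1 characters correctly.

```python
import io


def convert_file(src_path, dst_path,
                 src_encoding="iso2022_jp_2", dst_encoding="utf-8"):
    """Re-encode a text file, roughly like `iconv -f SRC -t DST`.

    Hypothetical helper for illustration; returns the decoded text so
    the caller can inspect it.
    """
    # newline="" keeps line endings exactly as they are in the source.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    return text
```

Calling convert_file("test.iso-2022-jp-2", "test.utf-8") would then play
the role of the iconv command, using the file names from this report.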

When running this script, I would expect a window
showing the same label twice. However, the first one
is missing the e acute.
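
The symptom can also be checked without GTK or the attached file. The
sketch below is an assumption-laden illustration, not the reporter's test:
the sample string and the helper names roundtrip and is_seven_bit are
invented here. It encodes a string containing e acute as ISO-2022-JP-2
and decodes it again; on an affected Python the decoded text would be
expected to come back without the accented character.

```python
ENCODING = "iso2022_jp_2"   # Python's codec name for ISO-2022-JP-2
SAMPLE = u"caf\u00e9"       # hypothetical sample text ending in e acute


def roundtrip(text, encoding=ENCODING):
    """Encode text and decode it again with the same codec."""
    return text.encode(encoding).decode(encoding)


def is_seven_bit(data):
    """Return True if every byte is below 0x80, as ISO-2022 requires."""
    return all(byte < 0x80 for byte in bytearray(data))


encoded = SAMPLE.encode(ENCODING)
decoded = roundtrip(SAMPLE)

# Any character listed here was dropped by the decoder.
missing = [ch for ch in SAMPLE if ch not in decoded]
print("7-bit clean: %s" % is_seven_bit(encoded))
print("characters lost in round trip: %r" % missing)
```

An empty list of lost characters indicates that the codec handles the
Latin-1 character; a list containing e acute matches the behaviour
described in this report.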

--
Francois

----------------------------------------------------------------------

>Comment By: Hye-Shik Chang (perky)
Date: 2006-03-13 19:27

Message:

Fixed in SVN (trunk: r42989, release24-maint: r42991).
Thank you for the report!


----------------------------------------------------------------------
