[New-bugs-announce] [issue23050] Add Japanese legacy encodings

Tetsuya Morimoto report at bugs.python.org
Sun Dec 14 15:34:49 CET 2014


New submission from Tetsuya Morimoto:

This patch adds the Japanese legacy encodings listed below.
https://bitbucket.org/t2y/cpython/branches/compare/japanese-legacy-encoding..default

* eucjp_ms (euc-jp compatible with cp932)
* iso2022_jp_ms (yet another iso-2022-jp variant compatible with cp932, similar to cp50220)
* cp50220 (http://www.iana.org/assignments/charset-reg/CP50220)
* cp50221 (a variant of cp50220)
* cp50222 (a variant of cp50220)
* cp51932 (http://www.iana.org/assignments/charset-reg/CP51932)
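
To see whether a given interpreter already knows any of the codec names listed above, a lookup along these lines may help. This is only a sketch: the helper names `codec_available` and `report` are made up for illustration, and no assumption is made about which names a particular Python build actually provides.

```python
import codecs

# Codec names proposed in this issue. Whether an interpreter ships them
# depends on whether the patch (or an equivalent) has been applied.
PROPOSED = ["eucjp_ms", "iso2022_jp_ms", "cp50220", "cp50221", "cp50222", "cp51932"]


def codec_available(name):
    """Return True if codecs.lookup() can find a codec registered under name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def report(names=PROPOSED):
    """Map each codec name to its availability in the running interpreter."""
    return {name: codec_available(name) for name in names}


if __name__ == "__main__":
    for name, ok in report().items():
        print(name, "available" if ok else "missing")
```

Running the script on an interpreter without the patch is expected to list the names as missing, but the output is left for the reader to confirm locally.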

Originally, these character encoding patches were created in 2005 by Masayuki Moriyama as the result of an IPA project. The results were contributed to several communities: libiconv, glibc, perl, PHP, Ruby, PostgreSQL, MySQL, and nkf. He also made a patch for Python 2.4.3 at that time, but no one worked on integrating it. That's a crying shame.

These character encodings are legacy, but they are still in use. Many end users do not care about character encodings. Unfortunately, for historical reasons, e-mails are encoded with these legacy encodings on Japanese Windows platforms. In fact, a customer of mine recently reported mojibake because their e-mail data was apparently encoded with cp50220 (iso-2022-jp-ms).
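
The kind of mismatch behind such reports can be illustrated with codecs that Python already ships. The sketch below assumes the circled digit U+2460 as a sample character: it is one of the vendor extension characters in cp932, while the stock iso2022_jp codec (limited to JIS X 0208) has no mapping for it. The helper name `encodable_with` is hypothetical, and this is not a reproduction of the customer's actual data.

```python
def encodable_with(text, encodings):
    """Return {encoding: True/False} telling whether text encodes without error."""
    result = {}
    for enc in encodings:
        try:
            text.encode(enc)
        except UnicodeEncodeError:
            result[enc] = False
        else:
            result[enc] = True
    return result


if __name__ == "__main__":
    sample = "\u2460"  # CIRCLED DIGIT ONE, a cp932 vendor extension character
    for enc, ok in encodable_with(sample, ["cp932", "iso2022_jp"]).items():
        print(enc, "ok" if ok else "cannot encode")
```

Mail software on Japanese Windows can emit such characters inside an ISO-2022-JP stream, which is the gap that codecs such as cp50220 and iso2022_jp_ms are meant to cover.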

References:

* About IPA: http://www.ipa.go.jp/english/about/summary.html
* Mojibake: http://en.wikipedia.org/wiki/Mojibake
* Java encoding names: http://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html

References in Japanese:

* Japanese Legacy Encoding Project: http://legacy-encoding.sourceforge.jp/wiki/
* Project details: http://www.ipa.go.jp/about/jigyoseika/05fy-pro/open/2005-1467d.pdf

----------
components: Library (Lib)
files: add-japanese-legacy-encoding1.patch
hgrepos: 285
keywords: patch
messages: 232638
nosy: ishimoto, naoki, t2y
priority: normal
severity: normal
status: open
title: Add Japanese legacy encodings
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file37447/add-japanese-legacy-encoding1.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue23050>
_______________________________________

