
classification
Title: add support for cjkcodecs to Python email
Type: enhancement
Stage:
Components: Library (Lib)
Versions:
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: barry
Nosy List: barry, jasonrm
Priority: normal
Keywords:

Created on 2003-12-01 22:05 by jasonrm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name         Uploaded
Charset-1.diff    jasonrm, 2003-12-01 22:06
Charset.py.diff   barry, 2003-12-30 15:16
Messages (9)
msg19221 - (view) Author: Jason R. Mastaler (jasonrm) Date: 2003-12-01 22:05
As discussed last week on the email-sig
list, the attached patch adds support for
the CJKCodecs package as an alternative
to the {Chinese,Japanese,Korean}Codecs
packages.  CJKCodecs 1.0.2 and above
should work with this patch.

This is advantageous because the ChineseCodecs
and KoreanCodecs packages are no longer
supported, maintained, or available for
download.

This patch does not break compatibility
with {Chinese,Japanese,Korean}Codecs,
so they can still be used if desired.

Lastly, this patch fixes a small typo that
broke GB2312.
msg19222 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-29 14:49
Logged In: YES 
user_id=12800

Correct me if I'm wrong, but won't the attached patch work
better?  It simply removes the entries from CODEC_MAP that
are already provided by cjkcodecs.aliases (and
japanese.aliases and korean.aliases).

See Charset.py.diff
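
As a rough illustration of the pruning idea above, written for modern Python 3 rather than the 2003 code base: the sketch below drops map entries whose charset name the codec registry already resolves on its own. The names CANDIDATE_MAP, is_resolvable, and the x-hypothetical-charset entry are made up for the example; only codecs.lookup is a real standard library call, and this is not the content of Charset.py.diff.

```python
import codecs

# Hypothetical charset -> codec overrides, for illustration only.
CANDIDATE_MAP = {
    "euc-jp": "japanese.euc-jp",
    "gb2312": "eucgb2312_cn",
    "x-hypothetical-charset": "hypothetical_codec",
}


def is_resolvable(name):
    """Return True if the codec registry can find a codec for this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Keep only the entries the registry cannot resolve by itself.
pruned = {
    charset: codec
    for charset, codec in CANDIDATE_MAP.items()
    if not is_resolvable(charset)
}
print(pruned)
```

On an interpreter that ships the CJK codecs, the Japanese and Chinese entries are dropped because the registry already knows those names, while the made-up entry stays in the map.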
msg19223 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-29 14:52
Oops, with the typo fix for gb2312.
msg19224 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-29 15:19
One more rev of Charset.py.diff
msg19225 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-30 04:33
I'm not even sure this patch is correct, since it breaks the
test suite.  The problem is that self.output_codec ends up
being different with the patch than without it (in
Charset.__init__()).  For example:

Python 2.3.3 (#1, Dec 19 2003, 11:33:00) 
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more
information.
>>> from email.Charset import Charset
>>> c = Charset('euc-jp')
>>> c.output_codec
'japanese.iso-2022-jp'
>>> 

But now with Charset.py.diff applied:

...
>>> c.output_codec
'euc-jp'

We need to figure out what the right thing to do here is.
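
One way to reason about the difference is a simplified model of codec resolution. The sketch below is a guess at the mechanism, not the real email.Charset source: it assumes an input charset is first mapped to an output charset, and that the output codec is then looked up in a codec map with the input codec as the fallback. The tables and the resolve_output_codec function are hypothetical; they are chosen only so that the model reproduces the two values shown in the sessions above.

```python
# Simplified, hypothetical model; NOT the actual email.Charset implementation.

# Assumed conversion table: input charset -> charset used on output.
OUTPUT_CHARSET = {"euc-jp": "iso-2022-jp", "shift_jis": "iso-2022-jp"}

# Assumed codec maps before and after removing the entries that the
# codec package's own aliases already cover.
CODEC_MAP_BEFORE = {
    "euc-jp": "japanese.euc-jp",
    "iso-2022-jp": "japanese.iso-2022-jp",
}
CODEC_MAP_AFTER = {}


def resolve_output_codec(input_charset, codec_map, fallback_to_input=True):
    """Return the output codec name under this simplified model."""
    output_charset = OUTPUT_CHARSET.get(input_charset, input_charset)
    input_codec = codec_map.get(input_charset, input_charset)
    # The choice of default is the assumption under test here.
    default = input_codec if fallback_to_input else output_charset
    return codec_map.get(output_charset, default)


before = resolve_output_codec("euc-jp", CODEC_MAP_BEFORE)
after = resolve_output_codec("euc-jp", CODEC_MAP_AFTER)
alternative = resolve_output_codec(
    "euc-jp", CODEC_MAP_AFTER, fallback_to_input=False
)
print(before, after, alternative)
```

Under these assumptions, emptying the map makes the lookup miss and fall back to the input codec, which would explain the change from 'japanese.iso-2022-jp' to 'euc-jp'. Falling back to the output charset name instead gives 'iso-2022-jp' in this model.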
msg19226 - (view) Author: Jason R. Mastaler (jasonrm) Date: 2003-12-30 05:59
Logged In: YES 
user_id=85984

Comments regarding Barry's Charset.py.diff:

You shouldn't mention KoreanCodecs and 
ChineseCodecs in the comments as
alternatives to CJKCodecs.  Both are no
longer maintained, or even available for
download.  Both have been completely
replaced by CJKCodecs.  Only 
JapaneseCodecs remains as a substitute
package.
msg19227 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-30 15:16
Latest version of the patch, with updated comments as per
Jason's followup, and including Tokio Kikuchi's fix for the
test suite regression.
msg19228 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2003-12-30 16:52
Applied to Python trunk (2.4).  This will be applied to
Python 2.3 and closed when that branch's freeze is lifted.
msg19229 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2004-05-10 15:04
Closing this without applying it to 2.3.  The general
consensus was not to mess with this in a bug fix release, so
it will have to wait until Python 2.4/email3.
History
Date                 User     Action  Args
2022-04-11 14:56:01  admin    set     github: 39645
2003-12-01 22:05:44  jasonrm  create