From jason at mastaler.com Mon Dec 1 17:12:15 2003
From: jason at mastaler.com (Jason R. Mastaler)
Date: Mon Dec 1 17:12:23 2003
Subject: [Email-SIG] Re: support CJKCodecs in Charset.py
References: <1069733653.31869.24.camel@anthem> <3FC3EF88.1080208@is.kochi-u.ac.jp>
Message-ID:

Jason R. Mastaler writes:

> As {CJK,Japanese,Korean}Codecs have their own .pth file that
> registers their encoding aliases, there's no need to use a specific
> module prefix like 'japanese.' or 'korean.' as is currently done in
> Charset.py. If we remove these prefixes there won't be an
> incompatibility issue with {CJK,Japanese,Korean}Codecs. The only
> compatibility issue will be with ChineseCodecs which will no longer
> be supported.

It appears that CJKCodecs 1.0.2 has just been released and includes compatibility aliases for ChineseCodecs, so there are now no incompatibility issues with CJKCodecs and {Chinese,Japanese,Korean}Codecs.

I've just posted a diff against Charset.py on the SF tracker to add CJKCodecs support.

From andrew at logicalprogression.net Wed Dec 24 14:45:28 2003
From: andrew at logicalprogression.net (Andrew Veitch)
Date: Wed Dec 24 14:45:40 2003
Subject: [Email-SIG] Re: A suggestion: HTML stripping
Message-ID:

Barry -

We're using 'stripogram' for this, from
http://www.zope.org/Members/chrisw/StripOGram

It's surprisingly difficult to convert HTML to plain text, but stripogram works pretty well - there are still one or two circumstances where what it produces isn't spot on.

Season's greetings to all & thanks very much for a great library.
Andrew

--
Logical Progression Ltd, 20 Forth Street, Edinburgh EH1 3LH, UK
Developers of MailManager: http://mailmanager.sourceforge.net/
Tel: +44 (0)131 550 3733
Web: http://www.logicalprogression.net/

From barry at python.org Mon Dec 29 08:53:11 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Dec 29 08:53:16 2003
Subject: [Email-SIG] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <3FEFE51D.6010205@is.kochi-u.ac.jp>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp>
Message-ID: <1072705990.1796.38.camel@anthem>

[A discussion about replacing JapaneseCodecs and KoreanCodecs in Mailman 2.1.4 with CJKCodecs]

On Mon, 2003-12-29 at 03:26, Tokio Kikuchi wrote:

> Sorry again Barry.
>
> We have to keep JapaneseCodecs and KoreanCodecs in the distribution
> and install in the pythonlib directory because email package designate
> japanese and korean as prefix of charsets. I will have to study more
> on cjkcodecs behavior (looks like japanese part has old bug in earlier
> distribution of JapaneseCodecs) so please cancel this checkin.

Oh dang.

The problem is CODEC_MAP in email/Charset.py, right?

Here's a hack for Mailman 2.1.4:

-----japanese.py
from cjkcodecs import euc-jp, iso-2022-jp, shift_jis

-----korean.py
from cjkcodecs import euc-kr, cp949, iso-2022-kr, johab

We add these two files to Mailman's pythonlib, and then the imports in Charset.py should work correctly.

It would be nice if cjkcodecs provided backwards compatibility. Otherwise, we probably want to provide some ourselves in email/Charset.py. I'm not sure there's a better way to do this, but attached is a strawman (untested) patch for email 2.5.5/Python 2.3.4.
It's too late to get this into Python 2.3.3, but if this is acceptable, I can check this in for Python 2.3.4, and cut a new email package tarball for Mailman 2.1.4, forgoing the above hack.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Charset.py.diff
Type: text/x-patch
Size: 2846 bytes
Desc: not available
Url : http://mail.python.org/pipermail/email-sig/attachments/20031229/5f477d83/Charset.py.bin

From barry at python.org Mon Dec 29 09:10:24 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Dec 29 09:10:29 2003
Subject: [Email-SIG] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <1072705990.1796.38.camel@anthem>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem>
Message-ID: <1072707023.1796.47.camel@anthem>

On Mon, 2003-12-29 at 08:53, Barry Warsaw wrote:

> It would be nice if cjkcodecs provided backwards compatibility.
> Otherwise, we probably want to provide some ourselves in
> email/Charset.py. I'm not sure there's a better way to do this, but
> attached is a strawman (untested) patch for email 2.5.5/Python 2.3.4.

Amend that. If I understand how all this works correctly, then importing cjkcodecs.aliases provides direct mapping for all the charsets. So since we already have "import cjkcodecs.aliases" in Mailman's paths.py, we could just delete euc-jp, iso-2022-jp, shift_jis, euc-kr, iso-2022-kr, ks_c_5601-1987, and johab from CODEC_MAP and be done with it.

It looks like we didn't need these aliases in CODEC_MAP even with the older codec packages, since they define all the aliases as well.
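
[Editorial sketch, not part of the original message.] The point about aliases can be illustrated with Python's codec registry. This assumes a modern Python in which the CJK codecs ship in the standard library, rather than the 2003 add-on packages discussed in this thread. `resolve_codec` is a hypothetical helper and not part of the email package.

```python
import codecs

def resolve_codec(charset):
    """Return the registry's canonical codec name, or None if unknown."""
    try:
        # codecs.lookup() consults the registered aliases, so no package
        # prefix such as 'japanese.' or 'korean.' is needed here.
        return codecs.lookup(charset).name
    except LookupError:
        return None

# The charset names Barry proposes deleting from CODEC_MAP.
for name in ("euc-jp", "iso-2022-jp", "shift_jis", "euc-kr",
             "iso-2022-kr", "ks_c_5601-1987", "johab"):
    print(name, "->", resolve_codec(name))
```

If every name resolves on its own, an explicit map entry for it adds nothing, which is the argument made above.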
-Barry From barry at python.org Mon Dec 29 09:12:11 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 09:12:17 2003 Subject: [Email-SIG] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072705990.1796.38.camel@anthem> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> Message-ID: <1072707131.1796.50.camel@anthem> Will this updated patch work? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: Charset.py.diff Type: text/x-patch Size: 1953 bytes Desc: not available Url : http://mail.python.org/pipermail/email-sig/attachments/20031229/a6f972a6/Charset.py.bin From perky at i18n.org Mon Dec 29 09:41:46 2003 From: perky at i18n.org (Hye-Shik Chang) Date: Mon Dec 29 09:47:15 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072705990.1796.38.camel@anthem> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> Message-ID: <20031229144146.GA836@i18n.org> On Mon, Dec 29, 2003 at 08:53:11AM -0500, Barry Warsaw wrote: > [A discussion about replacing JapaneseCodecs and KoreanCodecs in Mailman > 2.1.4 with CJKCodecs] > > On Mon, 2003-12-29 at 03:26, Tokio Kikuchi wrote: > > Sorry again Barry. > > > > We have to keep JapaneseCodecs and KoreanCodecs in the ditribution > > and install in the pythonlib directory because email package designate > > japanese and korean as prefix of charsets. 
I will have to study more
> > on cjkcodecs behavior (looks like japanese part has old bug in earlier
> > distribution of JapaneseCodecs) so please cancel this checkin.

I just got a mail from a Japanese user that describes problems with CJKCodecs' iso-2022-jp codec. I'm investigating it and I plan to release a new minor revision that fixes the problems soon. BTW, I think the shift-jis and euc-jp codecs of CJKCodecs 1.0.2 are stable and backward-compatible enough.

> Oh dang.
>
> The problem is CODEC_MAP in email/Charset.py, right?

There's a bug report by Jason R. Mastaler already:
http://www.python.org/sf/852347

> Here's a hack for Mailman 2.1.4:
>
> -----japanese.py
> from cjkcodecs import euc-jp, iso-2022-jp, shift_jis

and iso_2022_jp_1

> -----korean.py
> from cjkcodecs import euc-kr, cp949, iso-2022-kr, johab
>
> We add these two files to Mailman's pythonlib, and then the imports in
> Charset.py should work correctly.

Yup, it will. :)

> It would be nice if cjkcodecs provided backwards compatibility.
> Otherwise, we probably want to provide some ourselves in
> email/Charset.py. I'm not sure there's a better way to do this, but
> attached is a strawman (untested) patch for email 2.5.5/Python 2.3.4.

CJKCodecs already has enough compatibility aliases for consumer programs, except those that use the 'japanese.' or 'korean.' prefix explicitly. It has compatibility aliases for ChineseCodecs also.
Hye-Shik

From perky at i18n.org Mon Dec 29 09:44:09 2003
From: perky at i18n.org (Hye-Shik Chang)
Date: Mon Dec 29 09:47:16 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <1072707023.1796.47.camel@anthem>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <1072707023.1796.47.camel@anthem>
Message-ID: <20031229144409.GA1156@i18n.org>

On Mon, Dec 29, 2003 at 09:10:24AM -0500, Barry Warsaw wrote:
> On Mon, 2003-12-29 at 08:53, Barry Warsaw wrote:
>
> > It would be nice if cjkcodecs provided backwards compatibility.
> > Otherwise, we probably want to provide some ourselves in
> > email/Charset.py. I'm not sure there's a better way to do this, but
> > attached is a strawman (untested) patch for email 2.5.5/Python 2.3.4.
>
> Amend that. If I understand how all this works correctly, then
> importing cjkcodecs.aliases provides direct mapping for all the
> charsets. So since we already have "import cjkcodecs.aliases" in
> Mailman's paths.py, we could just delete euc-jp, iso-2022-jp, shift_jis,
> euc-kr, iso-2022-kr, ks_c_5601-1987, and johab from CODEC_MAP and be
> done with it.
>
> It looks like we didn't need these aliases in CODEC_MAPS even with the
> older codec packages, since they define all the aliases as well.

That's true, except for ChineseCodecs.
Hye-Shik From barry at python.org Mon Dec 29 09:57:03 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 09:57:09 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <20031229144146.GA836@i18n.org> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> Message-ID: <1072709823.1796.69.camel@anthem> On Mon, 2003-12-29 at 09:41, Hye-Shik Chang wrote: > There's a bug report by Jason R. Mastaler already: > http://www.python.org/sf/852347 Ah yes, I'd forgotten about that, thanks. I've followed up to that tracker item now. > CJKCodecs already have enough compatibility aliases for consumer > programs except that uses 'japanese.' or 'korean.' prefix explicitly. > It has compatibility aliases for ChineseCodecs also. Cool. So if the Charset.py.diff patch in the tracker above looks good to you, I'll commit that as soon as Python's release23-maint branch freeze is lifted. Then I'll cut email 2.5.5 and add that to Mailman 2.1.4. Sound good? 
-Barry From barry at python.org Mon Dec 29 09:58:26 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 09:58:30 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <20031229144409.GA1156@i18n.org> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <1072707023.1796.47.camel@anthem> <20031229144409.GA1156@i18n.org> Message-ID: <1072709905.1796.73.camel@anthem> On Mon, 2003-12-29 at 09:44, Hye-Shik Chang wrote: > > It looks like we didn't need these aliases in CODEC_MAPS even with the > > older codec packages, since they define all the aliases as well. > > It's true. But except for ChineseCodecs. Since we didn't have any prefixes except japanese and korean, I don't think we're in any worse shape for ChineseCodecs. Right? -Barry From barry at python.org Mon Dec 29 10:05:14 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 10:05:30 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <20031229144146.GA836@i18n.org> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> Message-ID: <1072710313.1796.76.camel@anthem> On Mon, 2003-12-29 at 09:41, Hye-Shik Chang wrote: > I just got a mail that describes problems on CJKCodecs' iso-2022-jp > codec from a Japanese user. I'm investigating it and I plan to > release new minor revision that fixes the problems soon. 
Oh yes, please let me know asap when this is ready. This is the last issue I need to clear up before I release Mailman 2.1.4, which /will/ happen before the end of this year. I'd like for that to be ready tomorrow (Tuesday 30-Dec) if possible.

-Barry

From perky at i18n.org Mon Dec 29 10:12:44 2003
From: perky at i18n.org (Hye-Shik Chang)
Date: Mon Dec 29 10:12:49 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <1072709823.1796.69.camel@anthem>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <1072709823.1796.69.camel@anthem>
Message-ID: <20031229151244.GA1646@i18n.org>

On Mon, Dec 29, 2003 at 09:57:03AM -0500, Barry Warsaw wrote:
> On Mon, 2003-12-29 at 09:41, Hye-Shik Chang wrote:
>
> > There's a bug report by Jason R. Mastaler already:
> > http://www.python.org/sf/852347
>
> Ah yes, I'd forgotten about that, thanks. I've followed up to that
> tracker item now.
>
> > CJKCodecs already have enough compatibility aliases for consumer
> > programs except that uses 'japanese.' or 'korean.' prefix explicitly.
> > It has compatibility aliases for ChineseCodecs also.
>
> Cool. So if the Charset.py.diff patch in the tracker above looks good
> to you, I'll commit that as soon as Python's release23-maint branch
> freeze is lifted. Then I'll cut email 2.5.5 and add that to Mailman
> 2.1.4.
>
> Sound good?
>

Okay for me. BTW, if aliases with the same key and value are not needed, can't the line below be removed along with the aliases?

    'utf-8': 'utf-8',

Thanks!
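
[Editorial sketch, not part of the original message.] Why an entry whose key equals its value is redundant, assuming the map is only ever read with a `get(charset, charset)` style fallback as in the patch under discussion. The map below is a hypothetical, trimmed-down stand-in and not the real CODEC_MAP.

```python
# Illustrative stand-in for a codec map; the entries are for demonstration only.
CODEC_MAP = {
    "utf-8": "utf-8",            # identity entry: key and value are the same
    "gb2312": "eucgb2312_cn",    # a real remapping, which must stay
}

def codec_for(charset, codec_map):
    # Fall back to the charset name itself when the map has no entry.
    return codec_map.get(charset, charset)

# Drop every identity entry and check that lookups are unchanged.
trimmed = {key: value for key, value in CODEC_MAP.items() if key != value}

for charset in ("utf-8", "gb2312", "euc-jp"):
    assert codec_for(charset, CODEC_MAP) == codec_for(charset, trimmed)
print(sorted(trimmed))
```

An entry mapping a charset to None (meaning "do no conversion") is a different case and is not redundant under this lookup.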
Hye-Shik

From barry at python.org Mon Dec 29 10:19:42 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Dec 29 10:19:51 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <20031229151244.GA1646@i18n.org>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <1072709823.1796.69.camel@anthem> <20031229151244.GA1646@i18n.org>
Message-ID: <1072711181.1796.83.camel@anthem>

On Mon, 2003-12-29 at 10:12, Hye-Shik Chang wrote:

> Okay for me. BTW, if no aliases with same key and value is needed,
> can't a line below the aliases removed together? :
>
> 'utf-8': 'utf-8',

Good catch, thanks!

-Barry

From tkikuchi at is.kochi-u.ac.jp Mon Dec 29 20:45:50 2003
From: tkikuchi at is.kochi-u.ac.jp (Tokio Kikuchi)
Date: Mon Dec 29 20:46:03 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <20031229144146.GA836@i18n.org>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org>
Message-ID: <3FF0D8CE.5010701@is.kochi-u.ac.jp>

Hi, All.

>>Oh dang.
>>
>>The problem is CODEC_MAP in email/Charset.py, right?
>
>
> There's a bug report by Jason R. Mastaler already:
> http://www.python.org/sf/852347
>
>
>>Here's a hack for Mailman 2.1.4:
>>
>>-----japanese.py
>>from cjkcodecs import euc-jp, iso-2022-jp, shift_jis
>

This will not do. (Syntax error!)
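
[Editorial sketch, not part of the original message.] The syntax error can be confirmed without installing any codec package, because `compile()` only parses the source and does not run the import. A dash is parsed as a minus sign, so `euc-jp` is not a valid name in an import list. The underscored module names in the second call are an assumption made for illustration.

```python
def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<sketch>", "exec")   # parse only; nothing is imported
        return True
    except SyntaxError:
        return False

# Names with dashes, as written in the proposed japanese.py.
print(compiles("from cjkcodecs import euc-jp, iso-2022-jp, shift_jis"))

# The same line with underscores (assumed module names).
print(compiles("from cjkcodecs import euc_jp, iso_2022_jp, shift_jis"))
```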
My fault is that I have separately installed both JapaneseCodecs and cjkcodecs in the python site-packages area. It looks like Mailman found the site-packages codecs before the mailman/pythonlib codecs.

Since we can get rid of the aliases in Charset.py, would it not be better to leave the package installation to the individual site owners? Some Japanese users seem to prefer JapaneseCodecs to cjkcodecs, and some even prefer using one which overrides special characters like full-width roman numerics.

Barry, I again suggest cancelling this commit for cjkcodecs altogether in the meantime of releasing 2.1.4.

--
Tokio Kikuchi, tkikuchi@ is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/

From barry at python.org Mon Dec 29 23:00:50 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Dec 29 23:00:58 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <3FF0D8CE.5010701@is.kochi-u.ac.jp>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp>
Message-ID: <1072756849.9216.15.camel@anthem>

On Mon, 2003-12-29 at 20:45, Tokio Kikuchi wrote:

> >>-----japanese.py
> >>from cjkcodecs import euc-jp, iso-2022-jp, shift_jis
> >
> > This will not do. (Syntax error!)

I noticed that. ;) Change the dashes to underscores.

> My fault is that I have separately installed both JapaneseCodecs
> and cjkcodecs in the python site-packages area. Looks like mailman
> has looked the site-package codecs before mailman/pythonlib codecs.

Hmm, it shouldn't. Mailman /should/ be set up to look in pythonlib first.
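
[Editorial sketch, not part of the original message.] How search order decides which copy of a package wins: Python imports the first match it finds along `sys.path`. The two temporary directories and the module name `demo_codec_pkg` are hypothetical stand-ins for site-packages and Mailman's pythonlib. This does not reproduce Mailman's actual paths.py.

```python
import importlib
import os
import sys
import tempfile

def make_module(directory, name, body):
    """Write a one-line module file into the given directory."""
    with open(os.path.join(directory, name + ".py"), "w") as handle:
        handle.write(body)

site_dir = tempfile.mkdtemp()      # stands in for site-packages
bundled_dir = tempfile.mkdtemp()   # stands in for Mailman's pythonlib
make_module(site_dir, "demo_codec_pkg", "ORIGIN = 'site-packages'\n")
make_module(bundled_dir, "demo_codec_pkg", "ORIGIN = 'pythonlib'\n")

sys.path.append(site_dir)          # searched late
sys.path.insert(0, bundled_dir)    # searched first, so this copy wins
importlib.invalidate_caches()

import demo_codec_pkg
print(demo_codec_pkg.ORIGIN)
```

Note that codec packages of that era also registered themselves through .pth files in site-packages, which is a separate mechanism from plain import order and is not modelled here.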
> Since we can get rid of the aliases in Charset.py, would it not be better
> to leave the package installation to the individual site owners?

Perhaps, but 1) I think Mailman should come with batteries included and be easy to install, 2) I don't want to rely on having to install these packages in the system's site-packages directory because that affects all users of Python on that system.

> Some Japanese users looks like to prefer JapaneseCodecs than cjkcodecs
> and some even prefer using one which override special characters like
> full-width roman numerics.

Hmm. I have to defer to you on this. In general though, it's a shame there has to be more than one codec package for Japanese. Also, is JapaneseCodecs still being developed?

> Barry, I again suggest cancelling this commit for cjkcodecs altogether
> in the meantime of releasing 2.1.4.

Looks like we'll have to. The other problem is that I can't make the necessary changes to the email package until the Python 2.3 branch is freed up, and it doesn't look like that will happen in time. I don't want to include an unreleased version of the email package with Mailman 2.1.4.

So we'll stick with the status quo for Mailman 2.1.4.

It would really be nice if Python 2.4 included the Asian codecs by default.
-Barry From barry at python.org Mon Dec 29 23:35:04 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 23:35:13 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072756849.9216.15.camel@anthem> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp> <1072756849.9216.15.camel@anthem> Message-ID: <1072758904.9216.24.camel@anthem> On Mon, 2003-12-29 at 23:00, Barry Warsaw wrote: > Looks like we'll have to. The other problem is that I can't make the > necessary changes to the email package until the Python 2.3 branch is > freed up and it doesn't look that that will happen in time. I don't > want to include an unreleased version of the email package with Mailman > 2.1.4. Besides, my patch for Charset.py breaks Python's test suite. I'm not yet sure what the right way to fix this is. 
http://sourceforge.net/tracker/index.php?func=detail&aid=852347&group_id=5470&atid=105470 -Barry From barry at python.org Mon Dec 29 23:48:01 2003 From: barry at python.org (Barry Warsaw) Date: Mon Dec 29 23:48:12 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <3FF0D8CE.5010701@is.kochi-u.ac.jp> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp> Message-ID: <1072759680.9216.28.camel@anthem> On Mon, 2003-12-29 at 20:45, Tokio Kikuchi wrote: > Barry, I again suggest cancelling this commit for cjkcodecs altogether > in the meantime of releasing 2.1.4. I've done this now in Mailman's cvs (Release_2_1-maint branch). Please double check. -Barry From perky at i18n.org Tue Dec 30 01:06:22 2003 From: perky at i18n.org (Hye-Shik Chang) Date: Tue Dec 30 01:06:26 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072710313.1796.76.camel@anthem> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <1072710313.1796.76.camel@anthem> Message-ID: <20031230060622.GA21672@i18n.org> On Mon, Dec 29, 2003 at 10:05:14AM -0500, Barry Warsaw wrote: > On Mon, 2003-12-29 at 09:41, Hye-Shik Chang wrote: > > > I just got a mail that describes problems on CJKCodecs' iso-2022-jp > > codec from a Japanese user. 
I'm investigating it and I plan to
> > release new minor revision that fixes the problems soon.
>
> Oh yes, please let me know asap when this is ready. This is the last
> issue I need to clear up before I release Mailman 2.1.4, which /will/
> happen before the end of this year. I'd like for that to be ready
> tomorrow (Tuesday 30-Dec) if possible.

All the problems with the iso-2022-jp* codecs are fixed and a release candidate for CJKCodecs 1.0.3 is ready. (anyway :-))

http://people.freebsd.org/~perky/cjkcodecs-1.0.3c1.tar.bz2

I'll release 1.0.3 final in a day or two.

Hye-Shik

From tkikuchi at is.kochi-u.ac.jp Tue Dec 30 07:12:25 2003
From: tkikuchi at is.kochi-u.ac.jp (Tokio Kikuchi)
Date: Tue Dec 30 07:12:55 2003
Subject: [Email-SIG] Re: [Mailman-i18n] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <1072758904.9216.24.camel@anthem>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp> <1072756849.9216.15.camel@anthem> <1072758904.9216.24.camel@anthem>
Message-ID: <3FF16BA9.7090508@is.kochi-u.ac.jp>

I think this patch will fix it.

@@ -221,6 +220,8 @@
         # it.
         henc, benc, conv = CHARSETS.get(self.input_charset,
                                         (SHORTEST, BASE64, None))
+        if not conv:
+            conv = self.input_charset
         # Set the attributes, allowing the arguments to override the default.
         self.header_encoding = henc
         self.body_encoding = benc
@@ -230,7 +231,7 @@
         self.input_codec = CODEC_MAP.get(self.input_charset,
                                          self.input_charset)
         self.output_codec = CODEC_MAP.get(self.output_charset,
-                                          self.input_codec)
+                                          self.output_charset)

     def __str__(self):
         return self.input_charset.lower()

Sorry for folding.
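
[Editorial sketch, not part of the original message.] The two changes in the patch above can be read as fallback rules: an empty output conversion defaults to the input charset, and the output codec falls back to the output charset rather than to the input codec. `CharsetSketch` and the trimmed-down tables are hypothetical stand-ins and not the real email/Charset.py.

```python
SHORTEST, BASE64 = "shortest", "base64"   # stand-in constants

# Illustrative tables: (header encoding, body encoding, output conversion).
CHARSETS = {
    "iso-2022-jp": (BASE64, None, None),
    "euc-jp": (BASE64, None, "iso-2022-jp"),
}
CODEC_MAP = {"us-ascii": None}            # no prefixed CJK entries any more

class CharsetSketch:
    def __init__(self, input_charset):
        self.input_charset = input_charset.lower()
        henc, benc, conv = CHARSETS.get(self.input_charset,
                                        (SHORTEST, BASE64, None))
        if not conv:
            # First hunk: no conversion listed means output equals input.
            conv = self.input_charset
        self.header_encoding = henc
        self.body_encoding = benc
        self.output_charset = conv
        self.input_codec = CODEC_MAP.get(self.input_charset,
                                         self.input_charset)
        # Second hunk: fall back to the output charset, not the input codec.
        self.output_codec = CODEC_MAP.get(self.output_charset,
                                          self.output_charset)

charset = CharsetSketch("euc-jp")
print(charset.input_codec, charset.output_codec)
```

With the old fallback, a charset that converts on output would have been given its input codec as the output codec once the explicit map entries were removed.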
Barry Warsaw wrote: > On Mon, 2003-12-29 at 23:00, Barry Warsaw wrote: > > >>Looks like we'll have to. The other problem is that I can't make the >>necessary changes to the email package until the Python 2.3 branch is >>freed up and it doesn't look that that will happen in time. I don't >>want to include an unreleased version of the email package with Mailman >>2.1.4. > > > Besides, my patch for Charset.py breaks Python's test suite. I'm not > yet sure what the right way to fix this is. > > http://sourceforge.net/tracker/index.php?func=detail&aid=852347&group_id=5470&atid=105470 > > -Barry > > > > _______________________________________________ > Mailman-i18n mailing list > Posts: Mailman-i18n@python.org > Unsubscribe: http://mail.python.org/mailman/options/mailman-i18n/tkikuchi%40is.kochi-u.ac.jp > > -- Tokio Kikuchi, tkikuchi@ is.kochi-u.ac.jp http://weather.is.kochi-u.ac.jp/ From tkikuchi at is.kochi-u.ac.jp Tue Dec 30 07:15:09 2003 From: tkikuchi at is.kochi-u.ac.jp (Tokio Kikuchi) Date: Tue Dec 30 07:15:25 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072759680.9216.28.camel@anthem> References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp> <1072759680.9216.28.camel@anthem> Message-ID: <3FF16C4D.1020509@is.kochi-u.ac.jp> Looks OK. I tested some messages without any codecs in site-package. -- Tokio Barry Warsaw wrote: > On Mon, 2003-12-29 at 20:45, Tokio Kikuchi wrote: > > >>Barry, I again suggest cancelling this commit for cjkcodecs altogether >>in the meantime of releasing 2.1.4. > > > I've done this now in Mailman's cvs (Release_2_1-maint branch). Please > double check. 
> > -Barry > > > From anthony at interlink.com.au Tue Dec 30 08:10:17 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Dec 30 08:10:53 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072756849.9216.15.camel@anthem> Message-ID: <200312301310.hBUDAHOC027873@localhost.localdomain> >>> Barry Warsaw wrote > Looks like we'll have to. The other problem is that I can't make the > necessary changes to the email package until the Python 2.3 branch is > freed up and it doesn't look that that will happen in time. I don't > want to include an unreleased version of the email package with Mailman > 2.1.4. In any case, the 2.3 branch is in feature freeze now (has been for quite some time) so it's not likely that this sort of new functionality is acceptable on the 2.3 branch. Anthony (wearing the harsh release manager hat). -- Anthony Baxter It's never too late to have a happy childhood. 
From barry at python.org Tue Dec 30 10:17:11 2003
From: barry at python.org (Barry Warsaw)
Date: Tue Dec 30 10:17:20 2003
Subject: [Email-SIG] Re: [Mailman-i18n] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <3FF16BA9.7090508@is.kochi-u.ac.jp>
References: <3FEF9259.3030809@is.kochi-u.ac.jp> <3FEFE51D.6010205@is.kochi-u.ac.jp> <1072705990.1796.38.camel@anthem> <20031229144146.GA836@i18n.org> <3FF0D8CE.5010701@is.kochi-u.ac.jp> <1072756849.9216.15.camel@anthem> <1072758904.9216.24.camel@anthem> <3FF16BA9.7090508@is.kochi-u.ac.jp>
Message-ID: <1072797430.9216.50.camel@anthem>

On Tue, 2003-12-30 at 07:12, Tokio Kikuchi wrote:

> I think this patch will fix.

Works for me, thanks. I've updated the tracker item.

-Barry

From barry at python.org Tue Dec 30 10:23:08 2003
From: barry at python.org (Barry Warsaw)
Date: Tue Dec 30 10:23:20 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <200312301310.hBUDAHOC027873@localhost.localdomain>
References: <200312301310.hBUDAHOC027873@localhost.localdomain>
Message-ID: <1072797788.9216.57.camel@anthem>

On Tue, 2003-12-30 at 08:10, Anthony Baxter wrote:

> In any case, the 2.3 branch is in feature freeze now (has been for
> quite some time) so it's not likely that this sort of new functionality
> is acceptable on the 2.3 branch.
>
> Anthony (wearing the harsh release manager hat).

I know that the branch is currently frozen waiting for Jack's thaw once the Mac version of 2.3.3 is finished.
So there's no way this will make it into the tree before the end of the year, which is my own self-imposed deadline for Mailman 2.1.4. No matter; I've reverted the change in Mailman so we won't be shipping CJKCodecs. But I do still think this is an appropriate patch for Python 2.3.x, since it really isn't a new feature. This change should be appropriate whether you continue to use the old (and unsupported) Korean and Chinese codecs, with the alternative (and supported) Japanese codec, or whether you decide to use the combined CJKCodecs package. At its heart the patch actually removes unnecessary dependencies on the separate Asian codec packages. Since they all provide aliases, this will make the Charset.py file independent of the codec package being used. As soon as Jack thaws the release23-maint branch, I think this patch should go in. I intend to apply it to the head for 2.4 now that the last regression has been fixed. -Barry From anthony at interlink.com.au Tue Dec 30 22:58:11 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue Dec 30 22:58:43 2003 Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE In-Reply-To: <1072797788.9216.57.camel@anthem> Message-ID: <200312310358.hBV3wBZn007978@localhost.localdomain> >>> Barry Warsaw wrote > But I do still think this is an appropriate patch for Python 2.3.x, > since it really isn't a new feature. This change should be appropriate > whether you continue to use the old (and unsupported) Korean and Chinese > codecs, with the alternative (and supported) Japanese codec, or whether > you decide to use the combined CJKCodecs package. At its heart the > patch actually removes unnecessary dependencies on the separate Asian > codec packages. 
Since they all provide aliases, this will make the
> Charset.py file independent of the codec package being used.

I guess the deciding thing (for me) is that code written to use Python 2.3.4 (and the new codec work) should work on Python 2.3.x (x<4). I really don't want to see another repeat of the 2.2.2 fiasco (where code written for 2.2.2 wouldn't work on 2.2.1 or 2.2, because of the new True/False objects). I've seen far, far too much code that's had to do

    try:
        True, False
    except:
        True = 1
        False = 0

Anthony
--
Anthony Baxter
It's never too late to have a happy childhood.

From barry at python.org Wed Dec 31 09:01:54 2003
From: barry at python.org (Barry Warsaw)
Date: Wed Dec 31 09:02:01 2003
Subject: [Email-SIG] Re: [I18n-sig] Re: [Mailman-Developers] Re: [Mailman-checkins] mailman/misc CJKCodecs-1.0.tar.gz, NONE, 1.1.2.1 .cvsignore, 2.2, 2.2.2.1 Makefile.in, 2.33.2.3, 2.33.2.4 paths.py.in, 2.6, 2.6.2.1 JapaneseCodecs-1.4.9.tar.gz, 2.1, NONE KoreanCodecs-2.0.5.tar.gz, 2.1, NONE
In-Reply-To: <200312310358.hBV3wBZn007978@localhost.localdomain>
References: <200312310358.hBV3wBZn007978@localhost.localdomain>
Message-ID: <1072879314.28895.238.camel@anthem>

On Tue, 2003-12-30 at 22:58, Anthony Baxter wrote:
> >>> Barry Warsaw wrote
> > But I do still think this is an appropriate patch for Python 2.3.x,
> > since it really isn't a new feature. This change should be appropriate
> > whether you continue to use the old (and unsupported) Korean and Chinese
> > codecs, with the alternative (and supported) Japanese codec, or whether
> > you decide to use the combined CJKCodecs package. At its heart the
> > patch actually removes unnecessary dependencies on the separate Asian
> > codec packages. Since they all provide aliases, this will make the
> > Charset.py file independent of the codec package being used.
>
> I guess the deciding thing (for me) is that code written to use Python
> 2.3.4 (and the new codec work) should work on Python 2.3.x (x<4).
I
> really don't want to see another repeat of the 2.2.2 fiasco (where
> code written for 2.2.2 wouldn't work on 2.2.1 or 2.2, because of the
> new True/False objects). I've seen far, far too much code that's had
> to do
>
> try:
>     True, False
> except:
>     True = 1
>     False = 0

Since I don't actually use the codecs, except in the context of Mailman and even then I couldn't tell you what all those pretty graphics mean, I think we have to ultimately defer to the experts. But I don't /think/ it's nearly as bad as this.

This change is useful even if you are using the older codecs and decide to stick with them. They define the necessary aliases to make this all work, so the dependencies on the japanese and korean package names aren't necessary.

-Barry