[issue12057] HZ codec has no test

Hyeshik Chang report at bugs.python.org
Thu May 12 02:10:00 CEST 2011


Hyeshik Chang <hyeshik at gmail.com> added the comment:

Hello, everyone!

I originally encoded the test strings into Python source code because I wanted them to be treated as text files that can be tracked in CVS or Subversion, and because I wanted to keep the Python sources free of any non-ASCII characters. I no longer feel the need for the "text file" status, so STINNER's suggestion works for me.

Actually, all "stateful" encodings supported by cjkcodecs lack adequate tests. (There are seven more stateful iso-2022 encodings in Python in addition to hz.)  "cjkencoding_tests.py" is used for random chunk coding tests, and most stateful encodings are not compatible with random chunk coding. For those reasons, I didn't include test strings for them there. But they clearly still need appropriate simple string coding and stream coding tests.
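To illustrate why stateful encodings need care when the input is cut into chunks, here is a minimal sketch. It does not reproduce the existing test suite; the sample text and the helper names decode_chunks_independently and decode_chunks_incrementally are made up for this message. It only shows that an hz byte string split at an arbitrary position cannot be decoded piece by piece with fresh decoders, while an incremental decoder that carries the shift state across chunks recovers the text.

```python
import codecs

# Sample text (an assumption for illustration): ASCII mixed with two
# Chinese characters that exist in GB2312.
text = "Python \u4e2d\u6587 test"
encoded = text.encode("hz")
# HZ enters GB mode with "~{" and returns to ASCII with "~}", so
# encoded is b"Python ~{VPND~} test".


def chunks(data, size):
    """Split a byte string into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def decode_chunks_independently(data, size, encoding="hz"):
    """Decode every chunk with a fresh decoder, so the shift state is lost."""
    return "".join(
        piece.decode(encoding, errors="replace") for piece in chunks(data, size)
    )


def decode_chunks_incrementally(data, size, encoding="hz"):
    """Decode the chunks with one incremental decoder that keeps its state."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [decoder.decode(piece) for piece in chunks(data, size)]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


if __name__ == "__main__":
    # A split at byte 10 falls in the middle of the GB-mode section.
    print(decode_chunks_independently(encoded, 10) == text)
    print(decode_chunks_incrementally(encoded, 10) == text)
```

A stream coding test for the stateful codecs could be built along the lines of the second helper, looping over several chunk sizes.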

STINNER Victor wrote:
> I don't understand why different texts are used. Why not just using the
> same original text for all testcases? One reason can be that some
> encodings (e.g. ISO 2022) use escape sequences to change the current
> encoding. Or maybe because the characters are different (chinese vs
> japanese characters?).

Almost every encoding in cjkcodecs has a different set of characters. They support different languages (Chinese, Japanese, Korean), different scripts (Hanja, Kanji, Traditional and Simplified Chinese), different standards (johab and KS X 1001 in Korean), and different versions/variants (JIS X 0201 and JIS X 0213 in Japanese).  Quite strikingly, one of them, gb18030, is in fact a "superset" of Unicode so far.
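A small sketch of what this means in practice. The sample characters and the helper name can_encode are my own picks for illustration, not taken from the test suite. A single mixed-script string round-trips through gb18030, while a codec with a narrower repertoire such as gb2312 rejects part of it.

```python
def can_encode(text, encoding):
    """Return True if every character of `text` is in the codec's repertoire."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Sample characters (assumptions for illustration): a Chinese ideograph,
# a Hangul syllable, a Hiragana letter, and one character outside the BMP.
mixed = "\u4e2d\ud55c\u3042\U0001F600"

if __name__ == "__main__":
    # gb18030 covers all four characters and decodes back to the same text.
    print(mixed.encode("gb18030").decode("gb18030") == mixed)
    # gb2312 has the ideograph but no Hangul syllables.
    print(can_encode("\u4e2d", "gb2312"), can_encode("\ud55c", "gb2312"))
```

This is why a single shared test text would have to be restricted to the few characters that every codec has in common.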


Terry J. Reedy wrote:
> Perhaps there should be a separate test like the above to be sure that hz really uses GB2312-80, as specified.

You're right.
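One possible shape for such a test, as a rough sketch only. The helper names below are hypothetical, and restricting the check to the ideograph range U+4E00..U+9FA5 is my assumption. The idea is that a character should be encodable in hz exactly when it is encodable in gb2312, and that the two bytes hz emits between "~{" and "~}" should equal the gb2312 bytes with the high bit cleared.

```python
def encode_or_none(ch, encoding):
    """Encode one character, or return None if the codec does not have it."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None


def hz_payload(ch):
    """Return the bytes hz emits for `ch` between the "~{" and "~}" shifts."""
    data = encode_or_none(ch, "hz")
    if data is None:
        return None
    if not (data.startswith(b"~{") and data.endswith(b"~}")):
        raise ValueError("character was not encoded in GB mode: %r" % ch)
    return data[2:-2]


def gb2312_7bit(ch):
    """Return the gb2312 bytes for `ch` with the high bit of each byte cleared."""
    data = encode_or_none(ch, "gb2312")
    if data is None:
        return None
    return bytes(b & 0x7F for b in data)


def find_mismatches(codepoints):
    """List the code points where hz and gb2312 disagree."""
    return [cp for cp in codepoints
            if hz_payload(chr(cp)) != gb2312_7bit(chr(cp))]


if __name__ == "__main__":
    # An empty list means hz and gb2312 agree on the whole range.
    print(find_mismatches(range(0x4E00, 0x9FA6)))
```

Characters outside the ideograph range would need separate handling, since hz passes ASCII through unchanged and writes "~" as "~~".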


By the way, my previous e-mail address <perky at FreeBSD.org> is no longer reachable. Please write to <hyeshik at gmail.com> if you need to contact me.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12057>
_______________________________________

