[Python-Dev] test_unicode_file failing on Mac OS X

Jack Jansen Jack.Jansen at cwi.nl
Sun Dec 7 11:32:23 EST 2003


On 6-dec-03, at 18:48, Skip Montanaro wrote:

> Two of the test_unicode_file tests began failing on my Mac today (fresh
> cvs up, OS X 10.2.8, vanilla unix-style build):
>
>     ======================================================================
>     FAIL: test_directories (__main__.TestUnicodeFiles)
>     ----------------------------------------------------------------------
>     Traceback (most recent call last):
>       File "../Lib/test/test_unicode_file.py", line 155, in test_directories
>         self._do_directory(TESTFN_ENCODED+ext, TESTFN_ENCODED+ext, os.getcwd)
>       File "../Lib/test/test_unicode_file.py", line 103, in _do_directory
>         make_name)
>     AssertionError: '@test-a\xcc\x80o\xcc\x80.dir' != '@test-\xc3\xa0\xc3\xb2.dir'

This is probably related to the two normalization forms of Unicode: one 
(NFD, decomposed) keeps all accents separate from the letters as much 
as possible, and one (NFC, composed) prefers the reverse.

But the problem is that the string used by the test represents a 
character like "ä" as the single character "a with umlaut on it", and 
MacOSX prefers to store the same string as the two characters "a" and 
"umlaut on the previous char". The failure above shows exactly this: 
one name has each accent as a separate combining character, while the 
other uses single precomposed characters.
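
To make the two representations concrete, here is a small sketch in 
current Python 3 syntax (not the Python of this thread). It uses "ä" 
purely as an illustration; the variable names are made up for the 
example.

```python
import unicodedata

# Two ways to spell the same visible character "ä".
composed = "\u00e4"      # one code point
decomposed = "a\u0308"   # plain "a" followed by a combining mark

# Show what each string is made of.
for label, text in (("composed", composed), ("decomposed", decomposed)):
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), text.encode("utf-8"), names)

# The strings look identical on screen but do not compare equal.
print(composed == decomposed)
```

The two strings differ both in length and in their UTF-8 bytes, which 
is why a plain equality test on file names can fail.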

And while there are algorithms to convert the combined form of unicode 
to the uncombined form and vice versa, there are no Python codecs to do 
this. The OSX system calls do the right thing (they convert both forms 
to the one the OS prefers), but when you do a readdir() you don't get 
back the string you put in.
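
As a rough sketch of how such a comparison could be made tolerant of 
both forms: the standard unicodedata module (which has had normalize() 
since Python 2.3) can bring two names to a common form before comparing 
them. The code below is in current Python 3 syntax, the byte strings 
are copied from the AssertionError quoted above, and same_name() is a 
hypothetical helper, not something in the test suite. Note also that 
the form the file system uses may not match NFD exactly, so treat this 
as an approximation.

```python
import unicodedata

# The two byte strings from the AssertionError, decoded as UTF-8.
decomposed_name = b"@test-a\xcc\x80o\xcc\x80.dir".decode("utf-8")
composed_name = b"@test-\xc3\xa0\xc3\xb2.dir".decode("utf-8")

def same_name(a, b, form="NFC"):
    """Hypothetical helper: compare two names in one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# A direct comparison fails, as in the test.
print(decomposed_name == composed_name)

# After normalizing both sides, the names match.
print(same_name(decomposed_name, composed_name))
print(same_name(decomposed_name, composed_name, form="NFD"))
```

Either form works as the common ground, as long as both names are 
normalized the same way before the comparison.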
--
Jack Jansen, <Jack.Jansen at cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



