Unicode File Names

John Machin sjmachin at lexicon.net
Fri Oct 17 01:49:15 EDT 2008


On Oct 17, 2:56 pm, Jordan <jordan.tayl... at gmail.com> wrote:
> I'm not quite sure now if the problem is me, windows, or zipfile
> (which I kinda failed to mention before). Using
> os.listdir(unicode(os.listdir()))

You mean os.listdir(unicode(os.getcwd())), I presume.


> seems to have been a step in the
> right direction (thanks Chris and John). When testing things in the
> python interpreter, I don't seem to hit issues after using the above
> mentioned line.
>
> [code]>>> l = os.listdir(unicode(os.getcwd()))
> >>> l
>
> u'01-\u3072\u3089\u304c\u306a.jpg'
> u'02-\u3072\u3089\u304c\u306a.jpg'
> u'03-\u3072\u3089\u304c\u306a.jpg'
>
> >>>for thing in l:
>
> ...    print thing
> 01-ひらがな.jpg
> 02-ひらがな.jpg
> 03-ひらがな.jpg
>
> [/code]
> Yay.
>
> Having a file that tries "for thing in l: print thing" fails with:

>
>   File "C:\Python25\Lib\encodings\cp437.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode characters in
> position 13-16: character maps to <undefined>
>
> I'm perfectly willing to let command prompt refuse to print that (it's
> debugging only) if the next issue was resolved >_>:

Use print repr(thing) for debugging; the repr of a unicode object is
plain ASCII, so the console codec has nothing to choke on.
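
If you would rather see the name itself with only the unencodable
characters escaped, something along these lines should keep print from
raising. It is only a sketch: console_safe is a made-up helper name, and
cp437 is assumed from the traceback quoted above, so substitute whatever
encoding your console actually uses.

```python
def console_safe(name, encoding="cp437"):
    """Return `name` with characters the console codec lacks shown as \\uXXXX.

    `console_safe` is a hypothetical helper, and cp437 is an assumption
    taken from the quoted traceback.
    """
    # "backslashreplace" substitutes an escape for each unencodable character
    # instead of raising UnicodeEncodeError.
    encoded = name.encode(encoding, "backslashreplace")
    # Decoding with the same codec gives back text that codec can always print.
    return encoded.decode(encoding)


name = u"01-\u3072\u3089\u304c\u306a.jpg"
print(console_safe(name))  # prints 01-\u3072\u3089\u304c\u306a.jpg
```

Characters that do exist in the console's code page come through
unchanged; only the rest are escaped.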

>
> """
> Note: There is no official file name encoding for ZIP files. If you
> have unicode file names, please convert them to byte strings in your
> desired encoding before passing them to write(). WinZip interprets all
> file names as encoded in CP437, also known as DOS Latin.
> """
>
> I'm simply not sure what this means and how to deal with it.

Step 1:
Read appendix D of http://www.pkware.com/documents/casestudies/APPNOTE.TXT

Step 2:
Note the change history at the start of that document:
"""
6.3.0         -Added tape positioning storage          09/29/2006
               parameters
[snip]
              -Added option for Unicode filename
               storage
"""

Step 3: Read http://bugs.python.org/issue1734346

Step 4: Either wait for Python 2.7 or apply the patch to your own copy
of zipfile ...
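
For what it's worth, the "Unicode filename storage" option described in
Appendix D comes down to bit 11 (0x800) of the general purpose bit flag:
when it is set, the file name is stored as UTF-8. Below is a rough sketch
of how to check whether your zipfile sets it. It assumes a zipfile module
that already handles non-ASCII names this way (Python 3's does); the
helper names and UTF8_FLAG are made up for illustration, and the archive
is built in memory so nothing touches the disk.

```python
import io
import zipfile

# Bit 11 of the general purpose bit flag; the constant name is hypothetical.
UTF8_FLAG = 0x800


def zip_with_name(arcname, data):
    """Build a one-member archive in memory and return its raw bytes."""
    buf = io.BytesIO()
    zf = zipfile.ZipFile(buf, "w")
    try:
        zf.writestr(arcname, data)
    finally:
        zf.close()
    return buf.getvalue()


def utf8_flag_set(zip_bytes, arcname):
    """Report whether the member named `arcname` has the UTF-8 name flag set."""
    zf = zipfile.ZipFile(io.BytesIO(zip_bytes))
    try:
        return bool(zf.getinfo(arcname).flag_bits & UTF8_FLAG)
    finally:
        zf.close()


name = u"01-\u3072\u3089\u304c\u306a.jpg"
blob = zip_with_name(name, b"not really a jpeg")
print(utf8_flag_set(blob, name))
```

If that prints False (or blows up with a UnicodeError) on your
installation, your zipfile predates the change and you are back to
encoding the names yourself, as the docs note says.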


