Unicode File Names
John Machin
sjmachin at lexicon.net
Fri Oct 17 01:49:15 EDT 2008
On Oct 17, 2:56 pm, Jordan <jordan.tayl... at gmail.com> wrote:
> I'm not quite sure now if the problem is me, windows, or zipfile
> (which I kinda failed to mention before). Using
> os.listdir(unicode(os.listdir()))
You mean os.listdir(unicode(os.getcwd())), I presume.
> seems to have been a step in the
> right direction (thanks Chris and John). When testing things in the
> python interpreter, I don't seem to hit issues after using the above
> mentioned line.
>
> [code]>>> l = os.listdir(unicode(os.getcwd()))
> >>> l
>
> u'01-\u3072\u3089\u304c\u306a.jpg'
> u'02-\u3072\u3089\u304c\u306a.jpg'
> u'03-\u3072\u3089\u304c\u306a.jpg'
>
> >>> for thing in l:
> ...     print thing
> 01-ひらがな.jpg
> 02-ひらがな.jpg
> 03-ひらがな.jpg
>
> [/code]
> Yay.
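For what it's worth, the reason that works: given a unicode path,
os.listdir() returns unicode names; given a byte-string path, it returns
byte strings. A minimal sketch of the first case follows. The temporary
directory and the file name a.txt are made up for illustration, and
text_type is only a helper so the same lines run on Python 2 and 3.

```python
import os
import shutil
import tempfile

text_type = type(u'')  # unicode on Python 2, str on Python 3

workdir = tempfile.mkdtemp()  # hypothetical scratch directory
try:
    # Create one empty file so there is something to list.
    open(os.path.join(workdir, 'a.txt'), 'w').close()
    # A unicode path argument makes listdir return unicode names.
    names = os.listdir(text_type(workdir))
finally:
    shutil.rmtree(workdir)
```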
>
> Having a file that tries "for thing in l: print thing" fails with:
>
> File "C:\Python25\Lib\encodings\cp437.py", line 12, in encode
> return codecs.charmap_encode(input,errors,encoding_map)
> UnicodeEncodeError: 'charmap' codec can't encode characters in
> position 13-16: character maps to <undefined>
>
> I'm perfectly willing to let command prompt refuse to print that (it's
> debugging only) if the next issue was resolved >_>:
Use print repr(thing) for debugging; the repr of a unicode object is
plain ASCII, so the console can always display it.
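The traceback comes from print trying to encode the names to the
console's code page (cp437), which has no hiragana. A rough sketch of
the failure and of one way to get printable output, using only a file
name from your listing (the variable names are mine):

```python
name = u'01-\u3072\u3089\u304c\u306a.jpg'

# cp437 has no mapping for hiragana, so a strict encode fails,
# which is what print ran into.
try:
    name.encode('cp437')
    encodable = True
except UnicodeEncodeError:
    encodable = False

# 'backslashreplace' substitutes \uXXXX escapes instead of raising,
# giving output much like repr() that any console can show.
safe = name.encode('cp437', 'backslashreplace')
```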
>
> """
> Note: There is no official file name encoding for ZIP files. If you
> have unicode file names, please convert them to byte strings in your
> desired encoding before passing them to write(). WinZip interprets all
> file names as encoded in CP437, also known as DOS Latin.
> """
>
> I'm simply not sure what this means and how to deal with it.
Step 1:
Read appendix D of http://www.pkware.com/documents/casestudies/APPNOTE.TXT
Step 2:
Note the change history at the start of that document:
"""
6.3.0 -Added tape positioning storage 09/29/2006
parameters
[snip]
-Added option for Unicode filename
storage
"""
Step 3: Read http://bugs.python.org/issue1734346
Step 4: Either wait for Python 2.7 or apply the patch to your own copy
of zipfile ...
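
To see what the Unicode filename option amounts to, here is a minimal
sketch. It assumes a zipfile that already has the Unicode filename
support described in the APPNOTE (the one in current Python 3 releases
does; stock 2.5 does not). The archive is built in memory and the file
contents are dummy bytes; only the file name comes from your listing.

```python
import io
import zipfile

name = u'01-\u3072\u3089\u304c\u306a.jpg'

# Write one member with a non-ASCII name into an in-memory archive.
buf = io.BytesIO()
zf = zipfile.ZipFile(buf, 'w')
zf.writestr(name, b'dummy image bytes')
zf.close()

# Read it back and inspect the general purpose bit flags.
zf = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
info = zf.infolist()[0]
zf.close()

# Bit 11 (0x800) marks the stored name as UTF-8 rather than CP437.
uses_utf8_flag = bool(info.flag_bits & 0x800)
round_tripped = (info.filename == name)
```

Whether a given unzip tool honours that flag is a separate question, so
test with the tools your recipients actually use.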