handling of non-ASCII filenames?

Ulli Horlacher framstag at rus.uni-stuttgart.de
Wed Nov 18 11:45:49 EST 2015


I have written a program (Python 2.7) that reads a filename via
tkFileDialog.askopenfilename() (a good hint from another thread here).

This filename may contain non-ASCII characters (German umlauts).

In this case my program crashes with:

  File "S:\python\fexit.py", line 1177, in url_encode
    u += '%' + c.encode("hex").upper()
  File "C:\Python27\lib\encodings\hex_codec.py", line 24, in hex_encode
    output = binascii.b2a_hex(input)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0: ordinal not in range(128)


This is my encoding function:

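from re import match  # url_encode() relies on re.match

def url_encode(s):
  u = ''
  for c in list(s):
    # keep word characters and a few punctuation characters as they are
    if match(r'[_=:,;<>()+.\w\-]',c):
      u += c
    else:
      # percent-escape everything else; this is the line that raises
      # UnicodeEncodeError when c is a non-ASCII unicode character
      u += '%' + c.encode("hex").upper()
  return u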



As I am a Python newbie, I have not yet quite understood how Python handles
character encodings :-}

Where can I find a good introduction to this topic?

I would also appreciate a concrete solution for my problem :-)
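
A minimal sketch of one possible fix, assuming the filename arrives as a
unicode object and that the receiving side expects UTF-8 percent-encoding
(both are assumptions, not something confirmed in this post). The idea is to
encode the text to bytes explicitly before hex-escaping, so that no implicit
ASCII conversion takes place. The encoding parameter and the _SAFE name are
introduced here for illustration only.

```python
# -*- coding: utf-8 -*-
import re

# Same set of characters the original function leaves unescaped
# (\w spelled out as ASCII letters, digits and underscore).
_SAFE = re.compile(r'[_=:,;<>()+.A-Za-z0-9\-]')

def url_encode(s, encoding='utf-8'):
    """Percent-encode s; text is converted to bytes first (assumed UTF-8)."""
    # A unicode object (Python 2) or str (Python 3) has no byte
    # representation until it is encoded explicitly.
    if not isinstance(s, bytes):
        s = s.encode(encoding)
    out = []
    for b in bytearray(s):  # iterate over integer byte values
        if b < 128 and _SAFE.match(chr(b)):
            out.append(chr(b))
        else:
            out.append('%%%02X' % b)  # e.g. 0xC3 becomes %C3
    return ''.join(out)
```

With this sketch, a name such as u'sch\xf6n.txt' would be escaped byte by
byte after UTF-8 encoding instead of raising UnicodeEncodeError. The standard
library also offers urllib.quote() in Python 2 (urllib.parse.quote() in
Python 3), which may be worth considering if its default set of safe
characters fits the use case.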

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher at tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/


