Unicode Chars in Windows Path

Steven D'Aprano steve at pearwood.info
Wed Apr 2 22:37:27 EDT 2014


On Wed, 02 Apr 2014 16:27:04 -0700, Steve wrote:

> Hi All,
> 
> I'm in need of some encoding/decoding help for a situation for a Windows
> Path that contains Unicode characters in it.
> 
> ---- CODE ----
> 
> import os.path
> import codecs
> import sys
> 
> All_Tests =
> [u"c:\automation_common\Python\TestCases\list_dir_script.txt"]

I don't think this has anything to do with Unicode encoding or decoding. 
In Python string literals, the backslash makes the next character 
special. So \n makes a newline, \t makes a tab, and so forth. Only if the 
character being backslashed has no special meaning does Python give you a 
literal backslash:

py> print("x\tx")
x	x
py> print("x\Tx")
x\Tx


In this case, \a has special meaning, and is converted to the ASCII BEL 
control character:

py> u"...\automation"
u'...\x07utomation'
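
The same effect can be checked without relying on the interactive prompt. This is a small sketch, not the original poster's script: the names `broken` and `escaped` are made up for illustration, and the path is cut down to its first component so that `\a` is the only escape sequence involved.

```python
# "\a" is a recognised escape sequence (ASCII BEL, code point 7), so this
# literal silently loses both the backslash and the letter "a".
broken = u"c:\automation_common"

# Doubling the backslash keeps it as a literal character.
escaped = u"c:\\automation_common"

print(repr(broken))
print(repr(escaped))
print(u"\x07" in broken)    # True: the BEL character is in the string
print(u"\\" in broken)      # False: no backslash survived
```

No exception is raised when the literal is evaluated, which is why the problem usually only shows up later, when the file cannot be found.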


When working with Windows paths, you should make a habit of either 
escaping every backslash:

    u"c:\\automation_common\\Python\\TestCases\\list_dir_script.txt"

using a raw string:

    ur"c:\automation_common\Python\TestCases\list_dir_script.txt"

or just using forward slashes:

    u"c:/automation_common/Python/TestCases/list_dir_script.txt"


Windows accepts both forward slashes and backslashes as path separators.
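
As a rough check that the three spellings name the same file, the sketch below compares them using the standard library's `ntpath` module, which applies Windows path rules on any platform. The variable names are illustrative only. The raw string is written with a plain `r` prefix because the `ur` prefix is only accepted by Python 2.

```python
import ntpath  # the Windows flavour of os.path; importable on any platform

escaped = u"c:\\automation_common\\Python\\TestCases\\list_dir_script.txt"
raw = r"c:\automation_common\Python\TestCases\list_dir_script.txt"
forward = u"c:/automation_common/Python/TestCases/list_dir_script.txt"

# The escaped and raw literals produce the identical string.
print(escaped == raw)

# normpath rewrites forward slashes as backslashes under Windows rules,
# so the forward-slash spelling normalises to the same path.
print(ntpath.normpath(forward) == escaped)

# Both separators are understood when splitting off the file name.
print(ntpath.basename(forward))
```

This only compares the strings; it does not touch the file system, so it says nothing about whether the file actually exists.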


If you fix that issue, I expect your problem will go away.



-- 
Steven


