opening files with names in non-english characters.

venutaurus539 at gmail.com venutaurus539 at gmail.com
Mon Feb 23 23:20:30 EST 2009


On Feb 24, 8:29 am, "venutaurus... at gmail.com"
<venutaurus... at gmail.com> wrote:
> On Feb 23, 11:02 pm, Chris Rebert <c... at rebertia.com> wrote:
>
>
>
> > On Mon, Feb 23, 2009 at 5:51 AM, venutaurus... at gmail.com
>
> > <venutaurus... at gmail.com> wrote:
> > > Hi all,
> > >          I am trying to find the attributes of a file whose name has
> > > non-English characters, like the one given below. When I try to run my
> > > Python script, it fails with an error saying the filename must be a
> > > string or Unicode. When I try to copy the name of the file as a string,
> > > it (Komodo IDE) does not allow me to save the script, saying that it
> > > cannot convert some of the characters in the current encoding, which is
> > > Western European (CP-1252).
>
> > > 0010testUnicode_ėíîïðņōóôõöũøųúûüýþĸ !#$%&'()+,-.
> > > 0123456789;=... at ABCD.txt.txt
>
> > (1) How are you entering or retrieving that filename?
> > (2) Please provide the exact error and Traceback you're getting.
>
> > Cheers,
> > Chris
>
> > --
> > Follow the path of the Iguana...http://rebertia.com
>
> Hello,
>         First of all, thanks for your response. I've written a function,
> shown below, to recurse through a directory and return a file based on
> the value of n. I am calling this function from my main code to get that
> filename. The folder it recurses through contains a folder holding
> files with Unicode names (such as the example I gave earlier).
> ---------------------------------------------------------------------------
> def findFile(dir_path):
>     for name in os.listdir(dir_path):
>         full_path = os.path.join(dir_path, name)
>         print full_path
>         if os.path.isdir(full_path):
>             findFile(full_path)
>         else:
>             n = n - 1
>             if(n ==0):
>                 return full_path
> ---------------------------------------------------------------------------
> The problem is in the return statement. When I print the file name
> inside the function, it prints properly, but the receiving variable is
> not populated with the file name. The first block below shows the value
> of the full_path variable while control is at the return statement. The
> second block is the statement in the main code where the function call
> is made.
> Once control has returned to the main procedure after executing the
> findFile procedure, the third block gives the state of the file
> variable, which has type NoneType and value None. When I then try to
> check whether the path exists, it fails with the traceback below.
>
> ---------------------------------------------------------------------------
> E:\DataSet\Unicode\UnicodeFiles_8859\001_0006_test_folder
> \0003testUnicode_ÍÎIÐNOKÔÕÖ×ØUÚÛÜUUßaáâãäåæicéeëeíîidnokôõö÷øuúûüuu.txt.txt
> ---------------------------------------------------------------------------
> file = findFile(fpath)
> ---------------------------------------------------------------------------
> file
> NoneType
> None
>
> ---------------------------------------------------------------------------
> This is the final traceback:
>
> Traceback (most recent call last):
>   File "C:\RecallStubFopen.py", line 268, in <module>
>     if os.path.exists(file):
>   File "C:\Python26\lib\genericpath.py", line 18, in exists
>     st = os.stat(path)
> TypeError: coercing to Unicode: need string or buffer, NoneType found
>
> ---------------------------------------------------------------------------
>
> Please ask if you need any further information.
>
> Thank you,
> Venu
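
A likely cause of the None value described in the quoted message, offered as a sketch and not a confirmed diagnosis: findFile discards what the recursive call returns, so a match found inside a subfolder never reaches the caller, and the function falls off the end of its loop and returns None. The version below passes n as a parameter and walks the tree with a generator. The names iter_files and find_nth_file, the n parameter, and the sorted() call (added only to make the order predictable) are assumptions, not part of the original script.

```python
import os


def iter_files(dir_path):
    """Yield the full path of every file under dir_path, depth first."""
    # sorted() is an assumption added here so the order is predictable.
    for name in sorted(os.listdir(dir_path)):
        full_path = os.path.join(dir_path, name)
        if os.path.isdir(full_path):
            # Re-yield results from the subfolder instead of dropping them.
            for sub_path in iter_files(full_path):
                yield sub_path
        else:
            yield full_path


def find_nth_file(dir_path, n):
    """Return the path of the n-th file found (1-based), or None."""
    for count, path in enumerate(iter_files(dir_path), 1):
        if count == n:
            return path
    # Fewer than n files exist under dir_path.
    return None
```

Because find_nth_file can still return None when there are fewer than n files, the caller would need to check for None before handing the result to os.path.exists().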

To add to what I've said above: I also tried converting the file name
manually with the unicode() function, which raises this error:
Traceback (most recent call last):
  File "C:\RecallStubFopen.py", line 278, in <module>
    file = unicode(file)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xeb in position 74: ordinal not in range(128)
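
One reading of this error, stated as an assumption: unicode(file) without an encoding argument decodes with the default ASCII codec, so any byte above 0x7f, such as 0xeb, fails. The sketch below uses a made-up byte string (not the real file name) and assumes the bytes are CP-1252. The helper name decode_name is hypothetical, and sys.getfilesystemencoding() only reports which encoding the system expects, so the right codec for these particular files still has to be checked.

```python
import sys

# Hypothetical byte string standing in for a file name; 0xeb is the byte
# reported in the traceback.
raw_name = b"testUnicode_\xeb.txt"


def decode_name(raw, encoding="cp1252"):
    """Decode a byte-string file name with an explicit encoding.

    The cp1252 default is an assumption, not something the traceback proves.
    """
    return raw.decode(encoding)


# Decoding as ASCII fails on the 0xeb byte with UnicodeDecodeError.
try:
    raw_name.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

name = decode_name(raw_name)
print(ascii_ok)
print(repr(name))
print(sys.getfilesystemencoding())
```

In Python 2, another option is to pass a unicode path to os.listdir(), which then returns unicode names and avoids the manual decode step.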
