base64 and unicode
EuGeNe Van den Bulke
eugene.vandenbulke at gmail.com
Fri May 4 05:47:40 EDT 2007
Duncan Booth wrote:
> However, the decoded text looks as though it is utf16 encoded, so it should
> be written as binary, i.e. the output mode should be "wb".
Thanks for the "wb" tip, that works (see below). I guess it is
experience-based, but how could you tell that it was utf16 encoded?
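One common hint, not necessarily the one Duncan relied on, is a byte order mark at the start of the decoded data: FF FE or FE FF. The sketch below uses Python 3 syntax, and the helper name looks_like_utf16 is made up for illustration. It is only a heuristic; UTF-16 data without a BOM would not be detected.

```python
import codecs

def looks_like_utf16(data):
    """Heuristic: True if the bytes start with a UTF-16 byte order mark."""
    # FF FE marks little-endian, FE FF marks big-endian.
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))

hebrew = "\u05e9\u05dc\u05d5\u05dd"   # sample Hebrew text (assumption, not the original file)
with_bom = hebrew.encode("utf-16")    # the "utf-16" codec prepends a BOM
as_utf8 = hebrew.encode("utf-8")      # no UTF-16 BOM here

print(looks_like_utf16(with_bom))
print(looks_like_utf16(as_utf8))
```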
> Simpler than using the base64 module you can just use the base64 codec.
> This will decode a string to a byte sequence and you can then decode that
> to get the unicode string:
>
> from __future__ import with_statement  # required for 'with' in Python 2.5
> with file("hebrew.b64","r") as f:
>     text = f.read().decode('base64').decode('utf16')
>
> You can then write the text to a file through any desired codec or process
> it first.
>>> with file("hebrew.lang","wb") as f:
...     f.write(text.encode('utf16'))
Done ... superb!
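For readers on Python 3, where str.decode('base64') and file() no longer exist, a rough equivalent of the round trip might look like the sketch below. The sample text, the temporary path, and the helper name b64_utf16_to_text are assumptions for illustration, not the original hebrew.b64 data.

```python
import base64
import os
import tempfile

def b64_utf16_to_text(b64_data):
    """Decode base64 to bytes, then decode those bytes as UTF-16 text."""
    raw = base64.b64decode(b64_data)
    return raw.decode("utf-16")

# Build stand-in input: Hebrew text, UTF-16 encoded, then base64 encoded.
original = "\u05e9\u05dc\u05d5\u05dd"
encoded = base64.b64encode(original.encode("utf-16"))

text = b64_utf16_to_text(encoded)

# Write the bytes in binary mode, as in the "wb" tip above.
path = os.path.join(tempfile.mkdtemp(), "hebrew.lang")
with open(path, "wb") as f:
    f.write(text.encode("utf-16"))
```

Opening the output with open(path, "w", encoding="utf-16") and writing text directly would be another option in Python 3.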
> BTW, you may just have shortened your example too much, but depending on
> python to close files for you is risky behaviour. If you get an exception
> thrown before the file goes out of scope it may not get closed when you
> expect and that can lead to some fairly hard to track problems. It is much
> better to either call the close method explicitly or to use Python 2.5's
> 'with' statement.
Yes I had shortened my example but thanks for the 'with' statement tip
... I never think about using it and I should ;)
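A small sketch of why 'with' helps, written in Python 3 syntax with a made-up file name and a simulated error: the file is closed even though an exception interrupts the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical scratch file

try:
    with open(path, "w") as f:
        f.write("partial")
        raise ValueError("simulated failure")  # interrupts the block
except ValueError:
    pass

# The context manager closed the file on the way out of the block.
closed_after_error = f.closed
print(closed_after_error)
```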
Thanks,
EuGeNe -- http://www.3kwa.com