Ascii Encoding Error with UTF-8 encoder

Serge Orlov serge.orlov at gmail.com
Tue Jun 27 22:35:54 EDT 2006


On 6/27/06, Mike Currie <dev at null.com> wrote:
> Thanks for the thorough explanation.
>
> What I am doing is converting data for processing that will be tab (for
> columns) and newline (for rows) delimited. Some of the data contains tabs
> and newlines, so I have to convert them to something else to preserve the
> file's integrity.

This is usually done by escaping: translate tab -> \t, newline -> \n,
backslash -> \\.
Python strings can already do this in one line:
>>> s=chr(9)+chr(10)+chr(92)
>>> print s.encode("string_escape")
\t\n\\

When you're ready to convert it back, call decode("string_escape").
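
The same round trip can also be written by hand, which may be useful if
only tab, newline and backslash should be touched (string_escape also
escapes quotes and non-printable bytes). A minimal sketch, assuming those
three characters are the only ones that need escaping; escape_field and
unescape_field are hypothetical names, not part of any library:

```python
# Hypothetical helpers for illustration; not taken from any library.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n"}
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n"}


def escape_field(text):
    """Replace backslash, tab and newline with two-character escapes."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape_field(text):
    """Reverse escape_field; unknown escapes are left unchanged."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Consume the character after the backslash (may be absent).
            nxt = next(chars, "")
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


print(escape_field(chr(9) + chr(10) + chr(92)))  # \t\n\\
```

Escape each field before joining fields with tabs and rows with newlines,
and unescape each field after splitting, so that the delimiters never
appear inside the data itself.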


> Not my idea, I've been left with the implementation however.

The idea is actually not bad, as long as you know how to cope with Unicode.


