Ascii Encoding Error with UTF-8 encoder
John Machin
sjmachin at lexicon.net
Tue Jun 27 20:25:30 EDT 2006
On 28/06/2006 9:44 AM, Mike Currie wrote:
>
> What I am doing is converting data for processing that will be tab (for
> columns) and newline (for rows) delimited. Some of the data contains tabs
> and newlines, so I have to convert them to something else to keep the
> file's integrity intact.
>
> Not my idea, I've been left with the implementation however.
>
Do you *need* UTF-8? Or is that only there to hide away the \x88 and
\x83? Apart from tab and linefeed, what (if any) other characters are
there in the data that are not printable ASCII characters?
In any case, if you have 8-bit string data, the CSV file format would
appear to meet the requirement: it preserves your data by "quoting" any
field that contains a delimiter or a newline. The Python csv module has
been included in every Python distribution since 2.3.
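
As a minimal sketch of that suggestion, assuming Python 3 (where csv works
on text strings rather than the 8-bit strings discussed here) and some
made-up sample rows, the csv module can write tab-delimited output and
read it back with the embedded tabs and newlines intact. The names rows,
buffer and round_tripped are illustrative only, and an in-memory buffer
stands in for a real file:

```python
import csv
import io

# Hypothetical sample data: the last field contains a tab and a newline.
rows = [
    ["id", "comment"],
    ["1", "plain text"],
    ["2", "has a\ttab and a\nnewline"],
]

# Write tab-delimited output. With QUOTE_MINIMAL the writer quotes only
# the fields that contain the delimiter, the quote character, or a
# line break, so no manual substitution of tabs/newlines is needed.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter="\t", quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)

# Read the data back with the same delimiter and check that nothing
# was lost or altered on the way.
buffer.seek(0)
round_tripped = list(csv.reader(buffer, delimiter="\t"))
print(round_tripped == rows)
```

When writing to a real file instead of a StringIO, the csv documentation
recommends opening it with newline="" so that line endings inside quoted
fields are not translated.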
Cheers,
John