getting data with proper encoding to the finish

Mon Mar 14 20:27:41 EST 2005

John Machin wrote:
> Ksenia Marasanova wrote:
>> Sorry, I meant: I use field of the type 'text' in a Postgres table to
>> store my data. The data is a XML string.
>>
>>> Instead of "print data", do "print repr(data)" and show us what you
>>> get. What *you* see on the screen is not much use for diagnosis;
>>> it's the values of the bytes in the file that matter.
>>
>> Thanks for this valuable tip. I take letter "é" as an example.
>>
>> "print repr(data)" shows this:
>> u'\xe9'
>
> That doesn't look like an "XML string" to me. Show the WHOLE contents
> of the field.
>
> Have you read the docs of the Perl module of which pyXLWriter is a
> docs-free port? Right down the end it mutters something about XML
> parsers returning UTF8 which will jam up the works if fed into an
> Excel spreadsheet ...

Looking at the following function in pyXLWriter:
def _asc2ucs(s):
    """Convert ascii string to unicode."""
    return "\x00".join(s) + "\x00"

I can guess several things:
a) the pyXLWriter author is an ASCII guy :)
b) Unicode strings are not supported by pyXLWriter
c) Excel keeps Unicode text in UTF-16LE
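
To illustrate guesses (b) and (c), here is a rough sketch of what that
NUL-padding trick does to a byte string. The name asc2ucs_bytes is made
up for this example: it is a Python 3 re-creation that works on bytes,
not the code pyXLWriter actually runs.

```python
def asc2ucs_bytes(data):
    """Python 3 re-creation of the _asc2ucs idea: a NUL byte after every byte."""
    out = bytearray()
    for byte in data:
        out.append(byte)
        out.append(0)
    return bytes(out)

# Pure ASCII: the padded bytes are exactly the UTF-16LE encoding.
ascii_padded = asc2ucs_bytes(b"abc")
ascii_matches = ascii_padded == "abc".encode("utf-16le")

# UTF-8 encoded e-acute is two bytes; padding each byte separately
# produces two unrelated characters when read back as UTF-16LE.
utf8_padded = asc2ucs_bytes("\u00e9".encode("utf-8"))
mojibake = utf8_padded.decode("utf-16le")
```

So the padding only yields valid UTF-16LE for plain ASCII input; anything
that arrives as multi-byte UTF-8 would come out garbled.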

Ksenia, try encoding your unicode strings as utf-16le before passing
them to pyXLWriter. If that doesn't work, it means pyXLWriter needs
changes to support unicode strings.
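
The encoding step itself would look something like this. It is only a
sketch: encode_for_excel is a hypothetical helper name, and whether
pyXLWriter accepts the resulting bytes is exactly the part that still
needs testing.

```python
def encode_for_excel(text):
    """Hypothetical helper: return the UTF-16LE bytes for a unicode string."""
    # "utf-16le" writes the low byte first and adds no byte order mark;
    # the plain "utf-16" codec would prepend a BOM.
    return text.encode("utf-16le")

# The e-acute from the repr() output earlier in the thread.
encoded = encode_for_excel("\u00e9")

# A character outside Latin-1 (Cyrillic Zhe, U+0416) for comparison.
encoded_cyrillic = encode_for_excel("\u0416")
```

Each character in these examples becomes two bytes, low byte first.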

  Serge.