Problems with unicode
David Opstad
opstad at batnet.com
Sat Mar 13 17:39:33 EST 2004
In article <a091da2f.0403131345.5e82b07e at posting.google.com>,
jamesl at appliedminds.com (James Laamnna) wrote:
> Apparently in the batch that I'm encoding there is one string with
> non-ascii characters in it. Is there any way to just have it encode
> everything as unicode and not ascii?
A better question to ask is this: where did the supposedly ASCII data
come from in the first place? If, for instance, it came from a Windows
machine, then there's a good chance it's actually Windows-1252 (often
mislabeled as ISO-8859-1), in which case you can preserve the 0x92 by
decoding with the 'cp1252' codec instead of the 'ascii' one. Strict
ISO-8859-1 would map 0x92 to a control character rather than the curly
apostrophe you probably want. Similarly, if the original text came from
a Mac, then it's likely in Mac Roman, so if you use the 'mac-roman'
codec you'll get the correct character in your resulting Unicode.
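
Here is a rough sketch of how the same byte decodes under each guess. It
is written for Python 3, the sample string is made up for illustration,
and try_decode is just a hypothetical helper name, not anything from the
original poster's code:

```python
# Python 3 sketch. The sample bytes are invented for illustration;
# 0x92 stands in for the problem byte discussed in this thread.
raw = b"It\x92s here"

def try_decode(data, codec):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None

# Compare what each candidate source encoding makes of the same bytes.
for codec in ("ascii", "cp1252", "latin-1", "mac-roman"):
    print(codec, repr(try_decode(raw, codec)))
```

The 'ascii' codec refuses the byte outright, 'cp1252' yields U+2019
(right single quotation mark), 'latin-1' yields the control character
U+0092, and 'mac-roman' yields U+00ED (i with acute accent). Which one
is right depends entirely on where the data came from.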
Dave