encoding confusions

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Thu Mar 29 14:02:32 EDT 2007


In <eugrkh$7n7$1 at foggy.unx.sas.com>, Tim Arnold wrote:

> I have the contents of a file that contains French documentation.
> I've iterated over it and now I want to write it out to a file.
> 
> I'm running into problems and I don't understand why--I don't get how the 
> encoding works.
> My first attempt was just this:
> < snipped code for classes, etc; fname is string, codecs module loaded.>
> < self.contents is the French file's contents as a single string >

What is the type of `self.contents`, `str` or `unicode`?  You *decode*
byte strings (`str`) to unicode objects and you *encode* unicode objects
to byte strings.  It doesn't make sense to encode a `str` as 'latin-1':
it has to be decoded first, and the "automatic" decoding assumes ASCII
and barfs with a `UnicodeDecodeError` if there's anything non-ASCII in
the string.
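
Since the original code was snipped, here is only a rough sketch of the
decode/encode round trip, not a fix for the actual class.  The sample
bytes, the variable names and the temporary file name are all made up
for illustration.  It is written so that it should also run on newer
Pythons, where `str` is spelled `bytes` and `unicode` is spelled `str`.

```python
import codecs
import os
import tempfile

# Hypothetical sample data: French text as raw latin-1 bytes, standing in
# for whatever was read from the file.
raw = b'fran\xe7ais'

# Decode: byte string -> unicode object.
text = raw.decode('latin-1')

# Encode: unicode object -> byte string.  This round-trips cleanly.
assert text.encode('latin-1') == raw

# The "automatic" decoding, spelled out explicitly: ASCII cannot
# represent the byte 0xe7, so this raises UnicodeDecodeError.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Writing it out: a file opened with codecs.open() accepts unicode
# objects and encodes them on the way to disk.
fname = os.path.join(tempfile.mkdtemp(), 'french.txt')  # made-up path
out = codecs.open(fname, 'w', encoding='latin-1')
out.write(text)
out.close()

# Read the file back in binary mode to see the bytes that were written.
with open(fname, 'rb') as f:
    written = f.read()
```

So if `self.contents` is a `str`, decode it once with the encoding the
file really uses, work with the unicode object, and let the file object
from `codecs.open()` do the encoding when writing.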

Ciao,
	Marc 'BlackJack' Rintsch



