Python and UTF-8
Dave Pawson
DaveP at dpawsonNOSPam.freeserve.co.uk
Fri Jan 4 13:45:03 EST 2002
martin at v.loewis.de (Martin v. Loewis) wrote in
news:m3itak5to0.fsf at mira.informatik.hu-berlin.de:
> Brandvik <tmagna at online.no> writes:
>
>> Is it possible to make a python script that would change the character
>> to UTF-8 no matter what the encoding of the input is? I have heard
>> that Python has some great functions for Unicode formatting so this
>> might be an easy and trivial task, but I'm new to Python so I really
>> don't know...
>
> You have to know the encoding the data is currently, say
> current_encoding. Then, converting it into UTF-8, you write
>
> data = unicode(data, current_encoding).encode('utf-8')
If I have a file encoded in ISO 8859-1, can I use the same
approach?
This is prior to XSLT processing of older HTML files
originating in Scandinavia, which blow up when the XSLT
processor gets hold of them with no encoding specified!
I figured out how to add the encoding, but UTF-8 input
would make it far easier!
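
A minimal sketch of the conversion I have in mind, assuming the
input really is ISO 8859-1 throughout. It is written for current
Python 3, where the quoted unicode(data, current_encoding) call is
spelled data.decode(current_encoding). The names latin1_to_utf8 and
convert_file are made up for illustration.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO 8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Decode the raw bytes to text first, then encode that text as UTF-8.
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a UTF-8 copy to dst_path."""
    # Binary mode, so Python does not apply a platform default encoding.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(latin1_to_utf8(data))


# "blåbærsyltetøy" as ISO 8859-1 bytes: one byte per character.
sample = b"bl\xe5b\xe6rsyltet\xf8y"
converted = latin1_to_utf8(sample)
```

One caveat: every byte value is valid ISO 8859-1, so the decode step
never raises an error. If a file is actually in some other encoding,
the result is silently wrong rather than a failure.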
Regards DaveP