UTF-8 usage in Python 2.0
François Granger
francois.granger at free.fr
Fri Oct 27 17:52:25 EDT 2000
I am participating in the translation of the Python docs into French. I work
on a Mac (at home). Since the docs are to be delivered in iso-8859-1,
I came up with a simple script, borrowing from other scripts, which
translates my Mac charset to 8859-1.
def macTo88591(s):
    # entitydefs maps each Mac character to its 8859-1 equivalent.
    t = ""
    for c in s:
        # Characters missing from the table are copied unchanged.
        t = t + entitydefs.get(c, c)
    return t
Here entitydefs is a hacked version of the """HTML character entity
references.""" table, with Mac characters replacing the HTML entities as keys.
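For comparison, the same Mac-to-8859-1 conversion can be sketched with
Python's built-in codecs instead of a hand-made table. This is only a
sketch under assumptions that are not in the post above: it targets a modern
Python 3 interpreter, it assumes the Mac text is encoded as Mac Roman, and
the function name mac_to_latin1 is made up for illustration.

```python
def mac_to_latin1(data):
    """Re-encode Mac Roman bytes as ISO-8859-1 bytes.

    Assumption: the input really is Mac Roman. Characters that have no
    ISO-8859-1 equivalent are replaced with "?" instead of raising.
    """
    text = data.decode("mac_roman")          # bytes -> Unicode text
    return text.encode("iso-8859-1", errors="replace")


if __name__ == "__main__":
    # 0x8D is "ç" in Mac Roman; it becomes 0xE7 in ISO-8859-1.
    print(mac_to_latin1(b"fran\x8dais"))
```

Going through Unicode as a pivot avoids maintaining the character table by
hand, at the price of silently replacing characters that 8859-1 cannot hold.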
On the professional side, I receive HTML files translated into French.
Accented characters in them are coded as HTML entities, and I need to
convert them to UTF-8. I currently use Tidy to do this, but I have to make
some manual modifications afterwards.
I looked through the new features of Python 2 but did not find an
easy way to do something similar to what I did for the 8859-1
conversion.
Any ideas, or the beginning of a solution?
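One possible starting point, offered as a hedged sketch rather than a
definitive answer: replace each entity reference with its character using
the standard entity table, then encode the result as UTF-8. The sketch
assumes a modern Python 3 interpreter (not Python 2.0), and the names
entities_to_utf8, ENTITY_RE and MARKUP_CHARS are invented for illustration.
References to <, >, & and " are deliberately left escaped so that the HTML
markup itself is not altered.

```python
import re
from html.entities import name2codepoint

# Characters that must stay escaped so the HTML markup stays valid.
MARKUP_CHARS = set('<>&"')

# Matches named (&eacute;), decimal (&#233;) and hexadecimal (&#xE9;) references.
ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def _decode_reference(match):
    """Return the character for one reference, or the reference unchanged."""
    ref = match.group(1)
    if ref[0] == "#":
        code = int(ref[2:], 16) if ref[1] in "xX" else int(ref[1:])
    else:
        code = name2codepoint.get(ref)       # None for unknown names
    # Leave unknown names, out-of-range values and surrogates untouched.
    if code is None or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return match.group(0)
    char = chr(code)
    return match.group(0) if char in MARKUP_CHARS else char


def entities_to_utf8(html_text):
    """Decode entity references in html_text and return UTF-8 bytes."""
    return ENTITY_RE.sub(_decode_reference, html_text).encode("utf-8")


if __name__ == "__main__":
    print(entities_to_utf8("<p>fran&ccedil;ais &amp; &eacute;t&eacute;</p>"))
```

Note that a file converted this way would also need any charset declaration
in its header updated to UTF-8, which this sketch does not attempt.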
--
"Knowledge is the path to tolerance; this holds for
everyone, in every season."
- Raymond Page