One more ques Re: [XML-SIG] I am confused...

Martin v. Loewis martin@mira.cs.tu-berlin.de
Mon, 29 Jan 2001 23:50:00 +0100


> > And one more problem: my texts are far from plain ASCII.
> > Do I need to convert them to utf8 or unicode before
> > working with XML+XSLT+XPath?
> > Do I need Python-2 to implement non US-ASCII site (and not latin-1)?
> 
> It would certainly make life easier, but you should be able to use 1.5.2

Depending on the exact software package you are going to use, and the
exact encoding that your documents have, it may or may not work. For
example, expat only knows about Latin-1 and UTF-8. In Python 2, it
will have access to the Python codecs, but they are not present in
1.5.2.
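
To find out which encodings a given Python installation can actually handle, something along these lines may help. It is only a sketch and assumes an interpreter that ships the codecs module (Python 2.0 or later; written here in present-day syntax). The helper name available_codecs and the list of encoding names are made up for illustration, and the check says nothing about what expat or xml.unicode themselves support.

```python
import codecs


def available_codecs(names):
    """Map each encoding name to True if this Python has a codec for it."""
    result = {}
    for name in names:
        try:
            # codecs.lookup raises LookupError for unknown encodings.
            codecs.lookup(name)
            result[name] = True
        except LookupError:
            result[name] = False
    return result


# Illustrative names only; the last one is deliberately bogus.
status = available_codecs(
    ["latin-1", "utf-8", "iso-8859-5", "koi8-r", "no-such-encoding"]
)
for name, ok in status.items():
    print(name, "available" if ok else "missing")
```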

If you use drv_xmllib, and later when you produce output, the list of
supported encodings (from xml.unicode) is somewhat longer, but still
limited. E.g. ISO-8859-5 is supported, KOI8-R is not; that would be easy
to add, though.

Since the parsers perform the conversion to UTF-8 anyway, it is probably
best to recode to UTF-8 for 1.5.2 before parsing. Make sure that the
recoding drops or changes any encoding= attribute in the XML declaration,
though; otherwise the declaration no longer matches the actual bytes.
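
The recoding step could look roughly like the following. This is a sketch under assumptions, not something taken from any of the packages mentioned: it uses present-day Python with the built-in codecs (so it would not run on 1.5.2, where you would need an external recoding tool), it assumes you already know the source encoding, and the function name recode_to_utf8 and the regular expression are my own.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration,
# e.g. <?xml version="1.0" encoding="koi8-r"?>
_DECL_ENCODING = re.compile(
    r'^(<\?xml\b[^>]*?\bencoding\s*=\s*)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def recode_to_utf8(data, source_encoding):
    """Recode an XML document (bytes) from source_encoding to UTF-8.

    The encoding declared in the XML declaration, if any, is rewritten
    to UTF-8 so that it matches the recoded bytes.
    """
    text = data.decode(source_encoding)
    # Keep the prefix and the original quote character, swap the name.
    text = _DECL_ENCODING.sub(r'\1\g<2>UTF-8\2', text, count=1)
    return text.encode("utf-8")


# Hypothetical sample document, stored in KOI8-R.
original = (
    '<?xml version="1.0" encoding="koi8-r"?>\n<doc>Привет</doc>'
).encode("koi8_r")
recoded = recode_to_utf8(original, "koi8_r")
print(recoded.decode("utf-8"))
```

A document without an XML declaration passes through with only the byte recoding applied, which is fine because UTF-8 is the default in that case.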

Maybe you want to make an entire UTF-8 site :-? Many browsers display
that fine these days, in my experience.

Regards,
Martin