How to ask sax for the file encoding

"Martin v. Löwis" martin at v.loewis.de
Wed Oct 4 18:46:22 EDT 2006


Irmen de Jong wrote:
> As others have tried to explain, the encoding in the xml header is
> not part of the document data itself, it says something about the data.
> It would be a bad design decision imo to rely on this meta information
> if you really meant that information to be part of the data document.

A common problem is saving the data in the same encoding that they
originally had; this is what an editor typically does (you may know
Edward Ream for writing editors). XML parsers are notoriously bad
at supporting editors. There are so many lexical details that may
need to be preserved (such as the order of the attributes, and the
spaces inside the opening tag) that it is impractical to report all
of them to the application.
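
To illustrate the point, here is a minimal sketch using
xml.etree.ElementTree from the standard library. The parser choice and
the sample document are assumptions made for illustration, not something
taken from this thread. The parsed data is intact, but the quoting style
and the spacing inside the tag are not reported to the application, so
they cannot be written back.

```python
import xml.etree.ElementTree as ET

# Sample input with extra spaces inside the tag and mixed quote styles.
source = "<doc   b=\"2\"  a='1'></doc>"

root = ET.fromstring(source)

# The attribute values survive parsing...
print(root.attrib)

# ...but serializing the tree again produces the serializer's own
# spacing and quoting, not the original lexical form.
roundtrip = ET.tostring(root, encoding="unicode")
print(roundtrip)
print(roundtrip == source)
```

The last line should print False: the tree carries the same information,
but the text is no longer the text the user wrote.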

IMO, the only way to edit XML at a level that preserves even
the tiniest lexical details is to edit it as plain text
(i.e. without using an XML parser).
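
A rough sketch of what that could look like, under some loud assumptions:
the helper names declared_encoding and edit_as_text are hypothetical, the
regular expression only handles ASCII-compatible encodings (so not
UTF-16), and falling back to UTF-8 when no encoding is declared is a
simplification of what the XML specification describes.

```python
import re

# Matches an XML declaration at the very start of the file and captures
# the value of its encoding pseudo-attribute (ASCII-compatible input only).
_DECL = re.compile(
    rb'<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(raw, default="utf-8"):
    """Return the encoding named in the XML declaration, or the default."""
    m = _DECL.match(raw)
    return m.group(1).decode("ascii") if m else default


def edit_as_text(raw, old, new):
    """Replace text in the document and save it in its original encoding."""
    enc = declared_encoding(raw)
    text = raw.decode(enc)
    # Plain string replacement: everything not touched stays byte-for-byte
    # the same, including attribute order, quoting and spacing.
    return text.replace(old, new).encode(enc)
```

For example, edit_as_text(raw, "old text", "new text") on the bytes of a
Latin-1 file returns Latin-1 bytes again, with the rest of the markup
untouched.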

Regards,
Martin


