Unicode File

Martin von Loewis loewis at informatik.hu-berlin.de
Wed Aug 1 05:46:01 EDT 2001


Tksee Lhsoh <tksee at yahoo.com> writes:

> How do you know the encoding of a file ?
> 
> Is it specific to the application which generated the file?

Indeed it is. Sometimes a higher-level protocol will tell you,
e.g. the charset= parameter of a MIME Content-Type: header. In other
cases, the encoding is given inside the file, e.g. the encoding=
attribute in the <?xml ...?> declaration of an XML document.
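
As a rough sketch of those two cases, something like the following
could work in Python. The function names are made up for illustration,
and the XML helper assumes the declaration is written in an
ASCII-compatible encoding (so it would miss UTF-16 files, for example):

```python
import re
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset= parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the name and returns None if absent.
    return msg.get_content_charset()


# Matches an XML declaration at the very start of the data and captures
# the value of its encoding= attribute.
_XML_DECL = re.compile(
    rb"^<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def encoding_from_xml_declaration(data):
    """Return the encoding named in the XML declaration of `data` (bytes),
    or None if there is no declaration or it names no encoding."""
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None
```

For real XML processing, an XML parser already does this (and handles
byte order marks as well); the regular expression is only meant to show
where the information lives.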

In most cases, you just have to know. If you don't, you can try to
guess the encoding from the file contents. That is typically a
non-trivial algorithm, though, unless you know that the encoding must
be one of a few candidates.
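
For the easy case, where only a few candidates are possible, a minimal
sketch is to try each one in turn and keep the first that decodes
without error. The function name and the default candidate list are
assumptions for illustration, not a general-purpose detector:

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `data` (bytes)
    without error, or None if none of them does.

    Order matters: put strict encodings first. latin-1 accepts every
    byte sequence, so it only makes sense as the last resort.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

A successful decode only shows that the bytes are valid in that
encoding, not that the text is what the author meant, which is why the
general problem is so much harder.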

Regards,
Martin



