Unicode File
Martin von Loewis
loewis at informatik.hu-berlin.de
Wed Aug 1 05:46:01 EDT 2001
Tksee Lhsoh <tksee at yahoo.com> writes:
> How do you know the encoding of a file ?
>
> Is it specific to the application which generated the file?
Indeed it is. Sometimes a higher-level protocol will tell you,
e.g. the charset= parameter of a MIME Content-Type: header. In other
cases, the encoding is given inside the file itself, e.g. the encoding=
attribute in the <?xml declaration at the start of an XML document.
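As a rough sketch of those two cases in Python: the helper names below
(charset_from_content_type, encoding_from_xml_declaration) are made up for
illustration, and the XML check assumes the file starts with an
ASCII-compatible declaration, so it will not recognise e.g. UTF-16 files.

```python
import re
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset= parameter of a Content-Type value, or None.

    Hypothetical helper; it leans on the standard email package to do
    the parameter parsing (the result is lower-cased by the library).
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Matches an XML declaration at the very start of the data and captures
# the value of its encoding= attribute. Assumes an ASCII-compatible file.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def encoding_from_xml_declaration(data):
    """Return the declared encoding from the leading bytes of a file, or None."""
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None
```

For example, charset_from_content_type('text/plain; charset="ISO-8859-1"')
gives 'iso-8859-1', and a file without an encoding= attribute gives None
from the second helper, in which case you are back to having to know.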
In most cases, you just have to know; if you don't, you could try to
guess the encoding based on the file contents. That is typically a
non-trivial algorithm, though, unless you know that the encoding must
be one of a few candidates.
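If you do know the encoding is one of a few, a very simple guess is to
try each candidate in turn and take the first one that decodes without
error. The function name and the default candidate list below are only
assumptions for the sake of the example. Note that the order matters:
latin-1 accepts any byte sequence, so it has to come last, and a
successful decode does not prove the guess is right.

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes data, or None.

    Hypothetical helper: it only checks that decoding succeeds, so put
    the strictest encodings first and catch-all ones (latin-1) last.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        return name
    return None


with_accent = "h\u00e9llo"  # "hello" with an e-acute
print(guess_encoding(b"hello"))
print(guess_encoding(with_accent.encode("utf-8")))
print(guess_encoding(with_accent.encode("latin-1")))
```

Anything beyond a short candidate list (say, telling apart several
8-bit encodings that all accept the same bytes) needs statistics on the
text itself, which is where it stops being trivial.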
Regards,
Martin