Determining encoding of a file

Tony Houghton h at realh.co.uk
Sat Feb 3 19:31:25 EST 2007


In <mailman.3514.1170536627.32031.python-list at python.org>,
Ben Finney <bignose+hates-spam at benfinney.id.au> wrote:

> Tony Houghton <h at realh.co.uk> writes:
>
>> In Linux it's possible for filesystems to have a different encoding
>> from the system's setting. Given a filename, is there a (preferably)
>> portable way to determine its encoding?
>
> If there were, PEP 263 would not be necessary.
>
>     <URL:http://www.python.org/dev/peps/pep-0263/>
>
> It's possible to *guess*, with no guarantee of getting the right
> answer; but it's far better to be explicitly *told* what the encoding
> is.

That seems to be specific to the encoding used in Python source files
anyway. What I want to do is guess the encoding of any file so I can
load it into a text editor based on gtksourceview, which is pure UTF-8.
The best I can do is assume it's in the system encoding, as reported by
locale.getdefaultlocale()[1]. Come to think of it, I wouldn't really be
any better off knowing whether the filesystem has a different encoding,
because that only determines the encoding of its filenames, not
necessarily the contents of its files. And Linux at least seems to be
able to translate filenames on the fly.
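
For illustration, here is a rough sketch of that kind of guessing,
written for current Python 3 rather than the Python of this thread. It
tries UTF-8 first, then the system encoding, then a single-byte
fallback. The function name guess_decode and the latin-1 fallback are
made up for this example, and locale.getpreferredencoding(False) is
swapped in for locale.getdefaultlocale()[1], which is deprecated in
recent Python 3 releases. It is still only a guess: a successful decode
does not prove the encoding is the right one.

```python
import locale


def guess_decode(data, fallback="latin-1"):
    """Guess how to decode a byte string by trial decoding.

    Returns (text, encoding_name). Hypothetical helper, not a library API.
    """
    candidates = [
        "utf-8-sig",                          # UTF-8, dropping a BOM if present
        locale.getpreferredencoding(False),   # the system encoding
        fallback,                             # assumed last resort
    ]
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or a codec name Python does not know: try the next.
            continue
    # Only reached if the fallback itself cannot decode the data.
    return data.decode(fallback, errors="replace"), fallback
```

UTF-8 goes first because random non-UTF-8 data rarely happens to be
valid UTF-8, whereas latin-1 accepts any byte sequence and so can only
sensibly come last. The resulting text can then be handed to the
editor widget as UTF-8.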

-- 
TH * http://www.realh.co.uk


