Windows, subprocess.Popen & encodage

Tue May 8 03:15:49 EDT 2007

> But, I don't found anything, in any documentations, on this.
> 
> Sombody can confirm?   Am I misled?  Am I right?

You are right, and you are misled. The encoding of the data
that you read from a Popen pipe (e.g. Popen.stdout.read()) is
not under the control of Python: not only do you not know it,
but Python doesn't know it either. The operating system simply
has no mechanism for indicating what encoding is used on a pipe.

So different processes may choose different encodings. Some
may produce UTF-16, others may produce CP-850, yet others
UTF-8, and so on. There really is no way to tell other than
reading the documentation *of the program you run*, and,
failing that, reading the source code of the program you
run.
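
To make this concrete, here is a minimal sketch (Python 3, not the
original poster's code). The helper name run_and_decode and the
stand-in child process are hypothetical; the child just writes the
bytes b'caf\x82' to its stdout. The pipe only ever delivers bytes, and
the text you end up with depends entirely on the encoding you choose
to assume.

```python
import subprocess
import sys

def run_and_decode(argv, encoding):
    """Run argv, read its raw stdout bytes, and decode them with the
    encoding the caller claims the program uses (hypothetical helper)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    raw, _ = proc.communicate()  # raw is bytes; the pipe carries no encoding info
    return raw.decode(encoding)

# Stand-in for "the program you run": it emits the single byte 0x82 after "caf".
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'caf\\x82')"]

as_oem = run_and_decode(CHILD, "cp850")    # 0x82 read as an OEM code page byte
as_ansi = run_and_decode(CHILD, "cp1252")  # 0x82 read as an ANSI code page byte
```

Both calls succeed without error, yet they yield different strings,
which is why only the program's documentation (or source) can settle
which one is correct.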

On Windows, many programs will indeed use one of the
two system code pages, or UTF-16. It's true that
UTF-16 can be detected quite reliably by looking at the
first two bytes (a byte order mark, if the program writes
one). However, the two system code pages
(OEM CP and ANSI CP) are not so easy to tell apart.
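
A rough sketch of that distinction, in Python 3 and under the
assumption that "the first two bytes" means a byte order mark. The
function names guess_utf16 and decodes_cleanly are hypothetical, and
cp850 and cp1252 merely stand in for the OEM and ANSI code pages, which
differ from system to system.

```python
import codecs

def guess_utf16(raw):
    """Return 'utf-16' if raw starts with a UTF-16 byte order mark, else None."""
    if raw[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        return "utf-16"
    return None

def decodes_cleanly(raw, encodings):
    """Return the subset of encodings that decode raw without raising an error."""
    ok = []
    for name in encodings:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok

utf16_output = "hi".encode("utf-16")   # Python's utf-16 codec prepends a BOM
codepage_output = b"caf\x82"           # no BOM; could be OEM or ANSI

candidates = decodes_cleanly(codepage_output, ["cp850", "cp1252"])
```

The BOM check is a heuristic only: a UTF-32 little-endian BOM begins
with the same two bytes, and a program may emit UTF-16 without any BOM.
For the code-page case, both candidates decode cleanly, so a successful
decode tells you nothing about which code page the program really used.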

Regards,
Martin