Python under PowerShell adds characters

eryk sun eryksun at gmail.com
Wed Mar 29 14:19:46 EDT 2017


On Wed, Mar 29, 2017 at 5:42 PM, Jay Braun <lyngwyst at gmail.com> wrote:
>
> I'm not using ISE.  I'm using a pre-edited script, and running it with the python command.
>
> Consider the following simple script named hello.py (Python 2.7):
>
> print "Hello"
>
> If I enter:
> python hello.py > out.txt
>
> from cmd.exe I get a 6-character file (characters plus new-line).
> from PowerShell I get an extra ^@ character after every character

You didn't say you were redirecting the output to a file. That's a
completely different story for PowerShell -- and far more frustrating.

cmd.exe implements redirecting a program's output to a file by
temporarily changing its own StandardOutput to the file; spawning the
process, which inherits the StandardOutput handle; and then changing
back to its original StandardOutput (typically a console screen
buffer). The program can write whatever it wants to the file, and cmd
isn't involved in any way.
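
The same hand-off can be sketched from Python with the subprocess
module. This is only an illustration of handle inheritance, not what
cmd.exe literally runs; run_with_file_stdout and the temporary path
are names made up for the example.

```python
import os
import subprocess
import sys
import tempfile


def run_with_file_stdout(args, path):
    """Run a child whose standard output is the file itself."""
    with open(path, "wb") as f:
        # The child inherits the file handle directly; the parent never
        # reads or rewrites the bytes the child writes.
        subprocess.run(args, stdout=f, check=True)
    with open(path, "rb") as f:
        return f.read()


# The child writes a single raw byte, b'\n', to its standard output.
child = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'\\n')"]
out_path = os.path.join(tempfile.mkdtemp(), "test.txt")
data = run_with_file_stdout(child, out_path)
print(data)  # b'\n'
```

Because the parent only passes the handle along, the file holds
exactly the bytes the child wrote.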

PowerShell is far more invasive. Instead of giving the child process a
handle for the file, it gives it a handle for a *pipe*. PowerShell
reads from the pipe, and like an annoying busybody that no one asked for,
decodes the output as text, processes it (e.g. replacing newlines),
and writes the processed data to the file. For example:

    PS C:\Temp> $script = "import sys; sys.stdout.buffer.write(b'\n')"
    PS C:\Temp> python -c $script > test.txt
    PS C:\Temp> python -c "print(open('test.txt', 'rb').read())"
    b'\xff\xfe\r\x00\n\x00'

I wrote a single byte, b'\n', but PowerShell decoded it, replaced "\n"
with "\r\n", and wrote it as UTF-16 with a BOM.
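
A rough model of that processing, written in Python, reproduces the
same bytes. It is a sketch under assumptions, not PowerShell's actual
implementation: powershell_style_redirect is a hypothetical helper, and
the "utf-8" decoding is a stand-in, since the encoding PowerShell
really uses to decode the pipe depends on the console configuration.

```python
import codecs
import subprocess
import sys


def powershell_style_redirect(args, encoding="utf-8"):
    """Model of the redirect: pipe -> decode -> re-split lines -> UTF-16."""
    # The child gets a pipe, not the file.
    raw = subprocess.run(args, stdout=subprocess.PIPE, check=True).stdout
    # Assumption: the output is decoded as text using `encoding`.
    text = raw.decode(encoding)
    # Each line is written back out terminated by "\r\n".
    processed = "".join(line + "\r\n" for line in text.splitlines())
    # The result is stored as little-endian UTF-16 with a BOM.
    return codecs.BOM_UTF16_LE + processed.encode("utf-16-le")


child = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'\\n')"]
result = powershell_style_redirect(child)
print(result)  # b'\xff\xfe\r\x00\n\x00'
```

The NUL byte after every character is the high byte of each UTF-16
code unit, which is what shows up as ^@ in the file.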


