Python under PowerShell adds characters

Chris Angelico rosuav at gmail.com
Thu Mar 30 01:48:59 EDT 2017


On Thu, Mar 30, 2017 at 4:43 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
> The input is not in my control, and bailing out may not be an option:
>
>    $ echo $'aa\n\xdd\naa' | grep aa
>    aa
>    aa
>    $ echo $'\xdd' | python2 -c 'import sys; sys.stdin.read(1)'
>    $ echo $'\xdd' | python3 -c 'import sys; sys.stdin.read(1)'
>    Traceback (most recent call last):
>      File "<string>", line 1, in <module>
>      File "/usr/lib64/python3.5/codecs.py", line 321, in decode
>        (result, consumed) = self._buffer_decode(data, self.errors, final)
>    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdd in position 0:
>     invalid continuation byte
>
> Note that "grep" is also locale-aware.

So what exactly does byte value 0xDD mean in your stream?

And if you say "it doesn't matter", then why are you assigning meaning
to byte value 0x0A in your first example? Truly binary data doesn't
give any meaning to 0x0A.
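
To make the distinction concrete, here is a minimal sketch, assuming
Python 3 and the same bytes as in the quoted example. The variable
names are purely illustrative. It treats the data three ways: as raw
bytes (what sys.stdin.buffer would hand you), as strict UTF-8 text,
and as "mostly text" using the surrogateescape error handler, which
is one possible way to carry undecodable bytes through unchanged.

```python
import io

# The bytes from the quoted example: "aa", a lone 0xDD, "aa".
raw = b"aa\n\xdd\naa\n"

# 1. Binary: nothing is decoded, so 0xDD is just a byte value.
#    In a real script this stream would be sys.stdin.buffer.
binary_stream = io.BytesIO(raw)
first_byte = binary_stream.read(1)

# 2. Strict UTF-8 text: 0xDD followed by 0x0A is not valid UTF-8,
#    so decoding raises UnicodeDecodeError.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 3. "Mostly text": surrogateescape maps each undecodable byte to a
#    lone surrogate code point, so 0x0A can still be given meaning
#    as a line separator and the original bytes can be recovered.
text = raw.decode("utf-8", errors="surrogateescape")
lines = text.split("\n")
round_trip = text.encode("utf-8", errors="surrogateescape")
```

Which of the three is right depends on the answer to the question
above: the third only makes sense if the stream really is text with
occasional junk in it, not arbitrary binary data.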

ChrisA


