[Tutor] bogus characters in a windows file
Peter Otten
__peter__ at web.de
Thu Feb 9 10:15:22 CET 2012
Garry Willgoose wrote:
> I input the data with the lines
>
> infile = open('c:\cpu.txt','r')
> infile.readline()
> infile.readline()
> infile.readline()
>
> the readline()s yield the following output
>
> '\xff\xfeP\x00r\x00o\x00c\x00e\x00s\x00s\x00I\x00d\x00 \x00 \x00\r\x00\n'
> '\x000\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'
> '\x004\x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00 \x00\r\x00\n'
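You were already told that you are trying to read a UTF-16-encoded file:
the leading '\xff\xfe' is the little-endian byte order mark, and every
ASCII character is followed by a '\x00' byte.
Here's how to deal with that: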
>>> import codecs
>>> with codecs.open("cpu.txt", "rU", encoding="UTF-16") as f:
...     for line in f:
...         print line.rstrip("\n")
...
ProcessId
0
4
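
For what it's worth, here is a minimal sketch of the same idea as a script, using io.open() so that it should run unchanged on Python 2.6+ and Python 3. The helper name read_utf16_lines() and the temporary sample file are my own assumptions for illustration; the sample bytes just mimic the readline() output quoted above. Unlike the session above, it strips the trailing padding spaces as well as the newline.

```python
import io
import os
import tempfile


def read_utf16_lines(path):
    """Return the lines of a UTF-16 text file, stripped of padding and newlines.

    Hypothetical helper, not part of any library.
    """
    # encoding="utf-16" reads the byte order mark and decodes accordingly;
    # text mode translates "\r\n" line endings to "\n".
    with io.open(path, "r", encoding="utf-16") as f:
        return [line.strip() for line in f]


# Build a small stand-in for cpu.txt: a little-endian BOM followed by
# UTF-16-LE text, which is what the quoted readline() output shows.
sample = u"ProcessId  \r\n0         \r\n4          \r\n"
path = os.path.join(tempfile.mkdtemp(), "cpu.txt")
with io.open(path, "wb") as f:
    f.write(b"\xff\xfe" + sample.encode("utf-16-le"))

lines = read_utf16_lines(path)
print(lines)
```

If you need to keep the column padding, replace line.strip() with line.rstrip("\n").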