UnicodeDecodeError issue

Ferrous Cranus nikos.gr33k at gmail.com
Wed Sep 4 04:35:06 EDT 2013


On Monday, 2 September 2013 9:28:36 PM UTC+3, Dave Angel wrote:
> On 2/9/2013 11:05, Ferrous Cranus wrote:
>
> > On 2/9/2013 3:21 PM, Dave Angel wrote:
> >> Starting with the byte string in the error message:
> >>>>> f = open("junk.txt", "w")
> >>>>> f.write(b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n')
> >>>>> f.close()
> >
> > Indeed, but yet again, file checks the encoding of the filename that
> > consists of these lines above, not of the actual strings.
>
> 'file' does nothing interesting with the filename, it just opens it and
> examines the contents.  For example,
>
> file www/cgi-bin/files.py
>
> will examine the Python source file, not run it.
>
> So first in the interpreter, I ran
>
> >>>> f = open("junk.txt", "w")
> >>>> f.write(b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n')
> >>>> f.close()
>
> then at the bash prompt, I ran:
>
> davea at think2:~$ file junk.txt
> junk.txt: ISO-8859 text
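
To convince myself that the result depends only on the bytes inside the file, here is a minimal sketch of the same experiment done entirely in Python. The helper guess_text_kind is a made-up name for illustration and is only a rough stand-in for 'file', not how 'file' actually works internally. I open the file in binary mode ("wb"), which is an assumption on my part so that the bytes are written unchanged on Python 3 as well.

```python
import os
import tempfile

# The byte string from the error message, copied verbatim.
DATA = b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n'


def guess_text_kind(path):
    """Hypothetical helper: classify a file by its bytes, never by its name."""
    with open(path, "rb") as fh:
        raw = fh.read()
    # Try the strictest encodings first; the first one that decodes wins.
    for codec, label in (("ascii", "ASCII text"), ("utf-8", "UTF-8 text")):
        try:
            raw.decode(codec)
            return label
        except UnicodeDecodeError:
            continue
    return "8-bit text (not valid UTF-8)"


def demo():
    # Write the bytes to a throwaway junk.txt and classify what was written.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "junk.txt")
        with open(path, "wb") as fh:
            fh.write(DATA)
        return guess_text_kind(path)


if __name__ == "__main__":
    print(demo())
```

The name "junk.txt" plays no part in the answer; only the bytes read back from the file do.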


That is one clever idea, Dave.

I take it that the charset of the file 'junk.txt' is identified from the encoding of the characters read from within the file?

But wait a minute: what editor do you use to write these 3 lines?
I mean, I am a bit confused.

For example, I run 'nano tets.py', and the file contains:

f = open("junk.txt", "w") 
f.write(b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n') 
f.close() 

Then, when I save the file from nano, which by default uses the UTF-8 charset, how would it be possible to detect that the byte string inside it is supposed to be in the Greek ISO charset?
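
One way I could test this myself, as a minimal sketch: the line I type in nano contains only ASCII characters (backslashes, 'x', hex digits), so I would expect the saved source bytes to be the same whether the editor uses UTF-8 or the Greek ISO charset. The names SOURCE_LINE and value are my own, and I am assuming ISO-8859-7 (Python codec 'iso8859_7') is the Greek charset in question.

```python
import ast

# The bytes literal exactly as it is typed in the editor (a raw string, so
# the backslashes stay as literal characters here).
SOURCE_LINE = r"b'\xb6\xe3\xed\xf9\xf3\xf4\xef\xfc\xed\xef\xec\xe1 \xf3\xf5\xf3\xf4\xde\xec\xe1\xf4\xef\xf2\n'"

# Every character of the typed line is plain ASCII ...
source_is_ascii = all(ord(ch) < 128 for ch in SOURCE_LINE)

# ... so saving it as UTF-8 or as ISO-8859-7 yields identical bytes on disk.
same_on_disk = SOURCE_LINE.encode("utf-8") == SOURCE_LINE.encode("iso8859_7")

# Python turns the escapes into the real byte values only when it evaluates
# the literal; ast.literal_eval does that safely here.
value = ast.literal_eval(SOURCE_LINE)

# The resulting bytes are not valid UTF-8, but they do decode as ISO-8859-7.
try:
    value.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
greek_text = value.decode("iso8859_7")

if __name__ == "__main__":
    print(source_is_ascii, same_on_disk, len(value), valid_utf8)
    print(greek_text)
```

If that holds, the charset nano saves the script in would affect only the script itself, while the bytes that end up in junk.txt come from the \x escapes.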


