UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte

MRAB python at mrabarnett.plus.com
Thu Jul 4 11:10:56 EDT 2013


On 04/07/2013 14:38, Νίκος Γκρ33κ wrote:
> On 4/7/2013 4:34 PM, MRAB wrote:
>> On 04/07/2013 13:47, Νίκος wrote:
>>> On 4/7/2013 3:07 PM, MRAB wrote:
>>>> On 04/07/2013 12:36, Νίκος wrote:
>>>>> On 4/7/2013 2:06 PM, MRAB wrote:
>>>>>> On 04/07/2013 11:38, Νίκος wrote:
>>>>>>> On 4/7/2013 12:50 PM, Ulrich Eckhardt wrote:
>>>>>>>> On 04.07.2013 10:37, Νίκος wrote:
>>>>>>>>> I just started to have this error without changing nothing
>>>>>>>>
>>>>>>>> Well, undo the nothing that you didn't change. ;)
>>>>>>>>
>>>>>>>>> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
>>>>>>>>> [Thu Jul 04 11:35:14 2013] [error] [client 108.162.229.97] Premature end of script headers: metrites.py
>>>>>>>>>
>>>>>>>>> Why can't it decode the starting byte? What starting byte is that?
>>>>>>>>
>>>>>>>> It's the 0xb6 but it's expecting the starting byte of a UTF-8
>>>>>>>> sequence.
>>>>>>>> Please do some research on UTF-8, that should clear it up. You could
>>>>>>>> also search for common causes of that error.
>>>>>>>
>>>>>>> So you are also suggesting that what gethostbyaddr() returns is not
>>>>>>> UTF-8 encoded either?
>>>>>>>
>>>>>>> What character is 0xb6 anyways?
>>>>>>>
>>>>>> Well, it's from a bytestring, so you'll have to specify what encoding
>>>>>> you're using! (It clearly isn't UTF-8.)
>>>>>>
>>>>>> If it's ISO-8859-7 (what you've previously referred to as "greek-iso"),
>>>>>> then:
>>>>>>
>>>>>>  >>> import unicodedata
>>>>>>  >>> unicodedata.name(b"\xb6".decode("ISO-8859-7"))
>>>>>> 'GREEK CAPITAL LETTER ALPHA WITH TONOS'
>>>>>>
>>>>>> You'll need to find out where that bytestring is coming from.
>>>>>
>>>>> Right.
>>>>> But nowhere in my script (metrites.py) do I use an 'Ά', so I really have no
>>>>> clue where this is coming from.
>>>>>
>>>>> And you are right: if the byte had come from a UTF-8 encoding scheme,
>>>>> it would have been decoded automatically.
>>>>>
>>>>> The only thing I can say for sure is that this problem appears only
>>>>> when I cloudflare my domain "superhost.gr".
>>>>>
>>>>> If I un-cloudflare it, it ceases to display errors.
>>>>>
>>>>> Can you tell me how to write the following properly:
>>>>>
>>>>> host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'
>>>>>
>>>>> so that even if the function fails, "UnResolved" is returned?
>>>>> Somehow I need to capture the error.
>>>>>
>>>>> Or do I not have to, because the right-hand operand of "or" will be returned?
>>>>>
>>>> If gethostbyaddr fails, it raises socket.gaierror (which, from Python
>>>> 3.3 onwards, is a subclass of OSError), so try catching that, setting
>>>> 'host' to 'UnResolved' if it's raised.
>>>>
>>>> Also, try printing out ascii(os.environ['REMOTE_ADDR']).
>>>>
>>>
>>> I have followed your suggestion by trying this:
>>>
>>> try:
>>>     host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
>>> except socket.gaierror:
>>>     host = "UnResolved"
>>>
>>> and then re-cloudflared the "superhost.gr" domain.
>>>
>>> http://superhost.gr/ gives an internal server error.
>>>
>> Try catching OSError instead. (As I said, from Python 3.3,
>> socket.gaierror is a subclass of it.)
>>
>
> At least CloudFlare doesn't give me issues
>
> if I try this:
>
> try:
> 	host = os.environ['REMOTE_ADDR'][0]
> except socket.gaierror:
> 	host = "UnResolved"
>
It's pointless trying to catch a socket exception here because you're
not using a socket, you're just getting a string from an environment
variable.
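
As a rough sketch of what that snippet actually does (the address is the
client IP from your log, set by hand here purely for illustration):

```python
import os

# Stand-in for the value the web server would normally set; the address
# is taken from the error log quoted earlier, used only as an example.
os.environ["REMOTE_ADDR"] = "108.162.229.97"

addr = os.environ["REMOTE_ADDR"]  # a plain str; no socket is involved
first_char = addr[0]              # indexing a str yields a single character

print(repr(addr))
print(repr(first_char))
```

Note that the [0] is left over from indexing gethostbyaddr's result (a
tuple whose first item is the host name); applied to a string, it picks
out the first character only.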

> then I get no errors and a valid IP back,
>
> but the above fails.
>
> I don't know how to catch the exception with OSError.
>
> I know only these two:
>
> except socket.gaierror:
> except socket.herror:
>
> Both fail.
>
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!
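
Something along these lines should do, as a minimal sketch assuming
Python 3.3 or later (resolve_host is a made-up name for illustration,
not anything from your script):

```python
import socket


def resolve_host(addr):
    """Return the reverse-DNS host name for addr, or 'UnResolved' on failure."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        return socket.gethostbyaddr(addr)[0]
    except OSError:
        # From Python 3.3, socket.gaierror and socket.herror are both
        # subclasses of OSError, so this clause covers either of them.
        return "UnResolved"
```

In the script you would then write
host = resolve_host(os.environ['REMOTE_ADDR']).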



