Convert a list with wrong encoding to utf8

Piet van Oostrum piet-l at vanoostrum.org
Fri Feb 15 14:46:39 EST 2019


vergos.nikolas at gmail.com writes:

> On Thursday, 14 February 2019 at 8:56:31 PM UTC+2, MRAB wrote:
>
>> It doesn't have a 'b' prefix, so either it's Python 2 or it's a Unicode 
>> string that was decoded wrongly from the bytes.
>
> Yes, it doesn't have the 'b' prefix, so those hexadecimal escapes represent a string and not bytes.
>
> I just tried:
>
> names = tuple( [s.encode('latin1').decode('utf8') for s in names] )
>
> but I get
> UnicodeEncodeError('latin-1', 'Άκης Τσιάμης', 0, 4, 'ordinal not in range(256)')
>
> 'Άκης Τσιάμης' is a valid name but even so it gives an error.
>
> Is it possible that in Python 3 the Unicode string was wrongly decoded from the bytes?
>
> What can I do to get the names?

python3

>>> x = '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82'
>>> b = bytes(ord(c) for c in x)
>>> b.decode('utf-8')
'Άκης Τσιάμης'
>>> 
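
The UnicodeEncodeError quoted above suggests that at least one name in the list is already correctly decoded, so converting every element blindly will fail. Here is a minimal sketch that repairs only the strings that look like UTF-8 bytes read as Latin-1 and leaves the others untouched. The helper name fix_mojibake and the sample tuple are my own, for illustration only. Note that s.encode('latin-1') produces the same bytes as bytes(ord(c) for c in s) whenever every code point is below 256.

```python
def fix_mojibake(s):
    """Re-decode s as UTF-8 if it is UTF-8 bytes that were read as Latin-1.

    Strings that cannot be encoded as Latin-1 (already proper Unicode)
    or whose bytes are not valid UTF-8 are returned unchanged.
    """
    try:
        return s.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return s


# Hypothetical sample data: one wrongly decoded name, one correct name,
# and one plain ASCII name.
names = (
    '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82',
    'Άκης Τσιάμης',
    'John Smith',
)

names = tuple(fix_mojibake(s) for s in names)
print(names)
```

This is a heuristic, not a guarantee: a genuine Latin-1 string whose bytes happen to form valid UTF-8 would also be converted. The better fix is to decode the bytes with the right encoding at the point where they are first read.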
-- 
Piet van Oostrum <piet-l at vanoostrum.org>
WWW: http://piet.vanoostrum.org/
PGP key: [8DAE142BE17999C4]


