[New-bugs-announce] [issue13717] print fails on unicode '\udce5' surrogates not allowed
Atle Pedersen
report at bugs.python.org
Thu Jan 5 21:12:55 CET 2012
New submission from Atle Pedersen <atle.pedersen at gmail.com>:
I've made a short program to traverse a file tree and print the file names.
    import os

    for root, dirs, files in os.walk(path):
        for f in files:
            # Hex code point of each character in the file name
            hexa = ' '.join(["%02X" % ord(x) for x in f])
            print('file is', hexa, f)
This fails with the following file:
file is 67 72 DCE5 6B 61 6C 6C 65 6E 2E 6A 70 67 2E 68 74 6D 6C Traceback (most recent call last):
  File "/home/atle/bin/findpictures.py", line 16, in <module>
    print('file is',hexa,f)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce5' in position 2: surrogates not allowed
I don't really understand the issue, but this works with Python 2 and fails with 3.1.4 (Gentoo: dev-lang/python-3.1.4-r3).
Same code using Python 2.7.2 gives:
('file is', '67 72 E5 6B 61 6C 6C 65 6E 2E 6A 70 67 2E 68 74 6D 6C', 'gr\xe5kallen.jpg.html')
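The following is a minimal sketch of what appears to be going on, not a confirmed diagnosis. It assumes the file name is stored on disk as Latin-1 bytes (0xE5 for "å", as the Python 2 output suggests) while the filesystem encoding is UTF-8. Under that assumption, Python 3 maps the undecodable byte to the lone surrogate U+DCE5 via the "surrogateescape" error handler (PEP 383), and a strict UTF-8 encode, such as the one print performs, then rejects it. The variable names are illustrative only.

```python
# Raw bytes of the file name as assumed to be stored on disk (Latin-1).
raw = b'gr\xe5kallen.jpg.html'

# Undecodable bytes become lone surrogates under 'surrogateescape'.
name = raw.decode('utf-8', 'surrogateescape')
code_points = ' '.join("%02X" % ord(x) for x in name)

# A strict encode fails with the same reason as in the traceback.
try:
    name.encode('utf-8')
    reason = None
except UnicodeEncodeError as exc:
    reason = exc.reason

# Encoding with the same handler restores the original bytes,
# which can then be decoded leniently for display.
restored = name.encode('utf-8', 'surrogateescape')
printable = restored.decode('utf-8', 'replace')
as_latin1 = restored.decode('latin-1')
```

Here printable is safe to pass to print on a UTF-8 terminal, with U+FFFD standing in for the byte that could not be decoded; as_latin1 recovers the intended name only if the Latin-1 assumption holds.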
----------
components: Unicode
messages: 150684
nosy: Atle.Pedersen, ezio.melotti
priority: normal
severity: normal
status: open
title: print fails on unicode '\udce5' surrogates not allowed
type: behavior
versions: Python 3.1
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13717>
_______________________________________