[issue43333] utf8 in BytesGenerator

Chris report at bugs.python.org
Thu Feb 24 08:20:32 EST 2022


Chris <chrisstaunton1990 at gmail.com> added the comment:

I found this issue while googling the error. I'm having the same problem: as_bytes() breaks on non-ASCII characters.

I've tried policy=policy.default.clone(utf8=True), but it gives the same error.

The attached sample.py contains a sample email as a string. Its body includes the character \u200d, Zero Width Joiner (https://unicode-table.com/en/200D/).

UnicodeEncodeError: 'ascii' codec can't encode character '\u200d' in position 70: ordinal not in range(128)

Any assistance on how to solve this would be great. I can parse about 99% of the emails I've tried, but this one has me confused.
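
The failure described above can be sketched roughly as follows. This is a minimal sketch, not the contents of sample.py: the message text is a made-up stand-in, and the helper names serialize_from_str and serialize_from_bytes are hypothetical. The second helper, which encodes the text and parses it with message_from_bytes, is only an assumption about a possible workaround, not a confirmed fix for this issue.

```python
from email import message_from_bytes, message_from_string, policy

# Hypothetical stand-in for the message in sample.py (not the real attachment).
RAW = (
    "From: sender@example.com\n"
    "To: recipient@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "Body with a zero width joiner: \u200d\n"
)

# Same policy as mentioned in the comment above.
utf8_policy = policy.default.clone(utf8=True)


def serialize_from_str(raw):
    """Parse from str, then serialize; return the bytes or the exception raised."""
    msg = message_from_string(raw, policy=utf8_policy)
    try:
        return msg.as_bytes()
    except UnicodeEncodeError as exc:
        # This is the path that reports "'ascii' codec can't encode character".
        return exc


def serialize_from_bytes(raw):
    """Assumed workaround: encode to UTF-8 first and parse from bytes."""
    msg = message_from_bytes(raw.encode("utf-8"), policy=utf8_policy)
    return msg.as_bytes()


if __name__ == "__main__":
    print(repr(serialize_from_str(RAW)))
    print(serialize_from_bytes(RAW))
```

The idea behind the second helper is that a message parsed from bytes keeps its non-ASCII body bytes in a form the generator can write back out, whereas a body parsed from str holds real Unicode characters that the ASCII encoding step cannot represent.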

----------
nosy: +chrisstaunton1990
Added file: https://bugs.python.org/file50641/sample.py

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43333>
_______________________________________

