Making IDLE3 ignore non-BMP characters instead of throwing an exception?

Adam Funk a24061 at ducksburg.com
Mon Oct 17 14:03:37 EDT 2016


On 2016-10-17, Adam Funk wrote:

> I'm using IDLE 3 (with python 3.5.2) to work interactively with
> Twitter data, which of course contains emojis.  Whenever the running
> program tries to print the text of a tweet with an emoji, it barfs
> this & stops running:
>
>   UnicodeEncodeError: 'UCS-2' codec can't encode characters in
>   position 102-102: Non-BMP character not supported in Tk
>
> Is there any way to set IDLE to ignore these characters (either drop
> them or replace them with something else) instead of throwing the
> exception?
>
> If not, what's the best way to strip them out of the string before
> printing?

Well, to answer part of my own question, this works for stripping them
out:

     s = ''.join([c for c in s if ord(c) <= 0xFFFF])
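
For the "replace them with something else" variant, the same test can be
wrapped in a small helper. This is only a sketch: the name
replace_non_bmp and its replacement parameter are made up for
illustration and are not part of IDLE or the standard library. It
assumes that anything above U+FFFF is what Tk refuses to display.

```python
def replace_non_bmp(text, replacement="\ufffd"):
    """Return text with each code point above U+FFFF swapped for replacement.

    Hypothetical helper: the default replacement is U+FFFD (REPLACEMENT
    CHARACTER); pass "" to drop the characters instead.
    """
    return "".join(c if ord(c) <= 0xFFFF else replacement for c in text)


tweet = "Good morning \U0001F600 world"

# Substitute a plain "?" for the emoji so the result is safe to print.
print(replace_non_bmp(tweet, "?"))   # Good morning ? world

# An empty replacement drops the emoji, like the one-liner above.
print(replace_non_bmp(tweet, ""))    # Good morning  world
```

Calling print(replace_non_bmp(text)) instead of print(text) should avoid
the UnicodeEncodeError, at the cost of losing the emoji in the output.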



-- 
Master Foo said: "A man who mistakes secrets for knowledge is like
a man who, seeking light, hugs a candle so closely that he smothers
it and burns his hand."                            --- Eric Raymond


