How do I automate the removal of all non-ascii characters from my code?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Sep 12 04:49:56 EDT 2011


On Mon, 12 Sep 2011 06:43 pm Stefan Behnel wrote:

> I'm not sure what you are trying to say with the above code, but if it's
> the code that fails for you with the exception you posted, I would guess
> that the problem is in the "[more stuff here]" part, which likely contains
> a non-ASCII character. Note that you didn't declare the source file
> encoding above. Do as Gary told you.

Even with a source encoding declaration, you will probably have problems with
source files that include \xe2 and other "bad" bytes. Unless they happen to
fall inside a quoted string literal or a comment, I would expect a SyntaxError.
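
A rough way to see the difference, sketched for Python 3 (where source is
UTF-8 by default, so this is an assumption and not the Python 2 setup from
the original report). The helper name compiles() and the two sample strings
are made up for illustration; only the built-in compile() is real.

```python
# Hypothetical helper: does this snippet get past the compiler?
def compiles(source):
    try:
        compile(source, "<pasted>", "exec")
        return True
    except SyntaxError:
        return False

NBSP = "\u00a0"  # non-breaking space, as often found in HTML pages

# NBSP used where an ordinary space was intended, outside any literal.
outside_literal = "x" + NBSP + "= 1"

# The same character, but inside a quoted string literal.
inside_literal = "x = 'a" + NBSP + "b'"

print(compiles(outside_literal))
print(compiles(inside_literal))
```

The first snippet should be rejected and the second accepted, which matches
the behaviour described above: the character is only harmless when it is
part of a string.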

I have come across this myself. I haven't investigated in great detail, but
it appears to happen when copying and pasting code from a document (usually
HTML) that uses non-breaking spaces instead of ordinary \x20 space
characters. It takes just one to screw things up.
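
Something along these lines could locate the offending characters and swap
non-breaking spaces for ordinary ones. It is only a sketch, assuming Python 3
and source text already decoded to str; the names find_non_ascii(),
replace_nbsp() and the sample string are invented for the example.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    # Split on "\n" only, so unusual line separators are reported, not hidden.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


def replace_nbsp(text):
    """Replace non-breaking spaces (U+00A0) with ordinary \\x20 spaces."""
    return text.replace("\u00a0", " ")


# Hypothetical pasted code whose indentation is made of non-breaking spaces.
sample = "def f():\n\u00a0\u00a0\u00a0\u00a0return 1\n"

for line_no, col, ch in find_non_ascii(sample):
    print(line_no, col, repr(ch))

cleaned = replace_nbsp(sample)
compile(cleaned, "<cleaned>", "exec")  # raises SyntaxError if still broken
```

Reporting positions first, and replacing only the non-breaking space, avoids
silently mangling non-ASCII text that legitimately belongs in string
literals.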



-- 
Steven



