How do I automate the removal of all non-ascii characters from my code?

Rhodri James rhodri at wildebst.demon.co.uk
Mon Sep 12 17:39:41 EDT 2011


On Mon, 12 Sep 2011 15:47:00 +0100, jmfauth <wxjmfauth at gmail.com> wrote:

> On 12 sep, 10:49, Steven D'Aprano <steve
> +comp.lang.pyt... at pearwood.info> wrote:
>>
>> Even with a source code encoding, you will probably have problems with
>> source files including \xe2 and other "bad" chars. Unless they happen to
>> fall inside a quoted string literal, I would expect to get a
>> SyntaxError.
>>
>
> This is absurd and complete nonsense. The purpose
> of a coding directive is to inform the engine that
> is processing a text file about the "language" it
> has to speak. It can be an html, py or tex file.
> If you have a problem, it's probably a mismatch between
> your coding directive and the real encoding of the
> file. Typical case: ascii/utf-8 without signature.

Now read what Steven wrote again.  The issue is that the program contains
characters that are syntactically illegal.  The "engine" can be perfectly
correctly decoding a character as a smart quote or a non-breaking space
or an e-umlaut or whatever, but that doesn't make the character legal!
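
To make that concrete, here is a minimal sketch, assuming Python 3.  The
helper names (compiles, non_ascii_positions) are made up for illustration
and are not part of any library.  Note that under Python 3 an e-umlaut is
a legal identifier character, so the sketch sticks to smart quotes and a
non-breaking space, which are illegal outside a string literal however
correctly they were decoded.

```python
def compiles(source):
    """Return True if `source` parses as Python, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


def non_ascii_positions(source):
    """List (line, column, character) for every non-ASCII character.

    Lines and columns are 1-based.  This only reports where the
    characters are; whether to delete or replace them is left to you.
    """
    found = []
    for lineno, line in enumerate(source.split("\n"), 1):
        for col, ch in enumerate(line, 1):
            if ord(ch) > 127:
                found.append((lineno, col, ch))
    return found


# Smart quotes (U+201C, U+201D) used as string delimiters.
smart_quoted = "x = \u201chello\u201d\n"

# A non-breaking space (U+00A0) sitting between two tokens.
nbsp_between_tokens = "x =\u00a01\n"

# The same characters inside an ordinary string literal are just data.
inside_literal = "x = '\u201chello\u201d\u00a0'\n"

for name, src in [("smart_quoted", smart_quoted),
                  ("nbsp_between_tokens", nbsp_between_tokens),
                  ("inside_literal", inside_literal)]:
    print(name, "compiles:", compiles(src), non_ascii_positions(src))
```

Blindly stripping every non-ASCII character would also damage the third
case, where the characters are legitimate string data, so reporting the
positions and fixing them by hand (or replacing them with their ASCII
equivalents) is probably the safer route.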

-- 
Rhodri James *-* Wildebeest Herder to the Masses


