How do I automate the removal of all non-ascii characters from my code?

jmfauth wxjmfauth at gmail.com
Mon Sep 12 10:39:22 EDT 2011


On 12 sep, 10:17, Gary Herron <gher... at islandtraining.com> wrote:
> On 09/12/2011 12:49 AM, Alec Taylor wrote:
>
>
>
> > Good evening,
>
> > I have converted ODT to HTML using LibreOffice Writer, because I want
> > to convert from HTML to Creole using python-creole. Unfortunately I
> > get this error: "File "Convert to Creole.py", line 17
> > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> > on line 18, but no encoding declared; see
> > http://www.python.org/peps/pep-0263.html for details".
>
> > Unfortunately I can't post my document yet (it's a research paper I'm
> > working on), but I'm sure you'll get the same result if you write up a
> > document in LibreOffice Writer and add some End Notes.
>
> > How do I automate the removal of all non-ascii characters from my code?
>
> > Thanks for all suggestions,
>
> > Alec Taylor
>

Character encoding is a domain in its own right.
It is independent of any OS or application.

When working with (plain) text files, you should
always be aware of the encoding of the text you
are working on. If you are using coding directives,
you must ensure that your coding directive matches
the real encoding of the text file. A coding
directive is only informative; it does not set
the encoding.
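
To illustrate the point, here is a minimal sketch in Python 3. It
assumes the offending character is a typographic apostrophe saved as
UTF-8 (its first byte is 0xE2, as in the error message); the helper
name decode_as is made up for the example. The bytes on disk stay the
same, only the interpretation changes with the declared coding.

```python
# Bytes as an editor might have written them: "It’s" saved as UTF-8.
# The apostrophe U+2019 becomes the three bytes E2 80 99.
raw = "It\u2019s".encode("utf-8")

def decode_as(data, declared):
    """Decode the bytes with the declared coding (hypothetical helper)."""
    return data.decode(declared)

right = decode_as(raw, "utf-8")   # declaration matches the real coding
wrong = decode_as(raw, "cp1252")  # declaration does not match

print(repr(raw))    # the bytes are identical in both cases
print(repr(right))  # the intended text
print(repr(wrong))  # mojibake: three wrong characters instead of one
```

The directive never rewrites the file. If it disagrees with what the
editor really saved, you get either an error or the wrong characters.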

I'm pretty sure your problem comes from this. There
is a mismatch somewhere that you are not aware of.
Removing non-ascii chars is certainly not a worthwhile
solution. It must work. If you are working
properly, it cannot fail to work.
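
Rather than deleting characters, it may help to see where they are.
The following is only a sketch in Python 3; find_non_ascii and the
sample bytes are invented for the example, and in practice the data
would come from reading the script with open(path, "rb").

```python
def find_non_ascii(data):
    """Return (line_number, column, byte_value) for every byte above 0x7F."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits

# Sample source bytes containing a UTF-8 encoded apostrophe on line 2.
sample = b"print('ok')\nnote = '\xe2\x80\x99'\n"
hits = find_non_ascii(sample)

for line_no, col, byte in hits:
    print("line %d, column %d: byte 0x%02X" % (line_no, col, byte))
```

Once the positions are known, you can check which coding the editor
really used and declare that one.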

From a linguistic point of view, the web has informed
me that Creole (*all the Creoles*) can be written with
the iso-8859-1 coding. That means iso-8859-1, cp1252 and
all Unicode coding variants are possible coding directives.
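
As a small check of that last statement, here is a sketch in Python 3
with an example word I picked myself ("café"). Each of the three
codings can represent it, but the bytes differ, which is why the
directive has to name the coding that was actually used to save the
file.

```python
text = "caf\u00e9"  # "café": U+00E9 lies inside the iso-8859-1 range

# Encode the same text with each candidate coding.
encoded = {name: text.encode(name) for name in ("iso-8859-1", "cp1252", "utf-8")}

# Decoding with the same coding that produced the bytes restores the text.
round_trips = all(data.decode(name) == text for name, data in encoded.items())

for name, data in encoded.items():
    print(name, repr(data))
print("round trips:", round_trips)
```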

jmf



