trying to strip out non ascii.. or rather convert non ascii

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Oct 26 18:24:43 EDT 2013


On Sat, 26 Oct 2013 16:11:25 -0400, bruce wrote:

> hi..
> 
> getting some files via curl, and want to convert them from what i'm
> guessing to be unicode.
> 
> I'd like to convert a string like this:: <div class="profName"><a
> href="ShowRatings.jsp?tid=1312168">Alcántar, Iliana</a></div>
> 
> to::
> <div class="profName"><a href="ShowRatings.jsp?tid=1312168">Alcantar,
> Iliana</a></div>
> 
> where I convert the
> " á " to " a"

Why on earth would you want to throw away perfectly good information? 
It's 2013, not 1953, and if you're still unable to cope with languages 
other than English, you need to learn new skills.

(Actually, not even English, since ASCII doesn't even support all the 
characters used in American English, let alone British English. ASCII was 
broken from the day it was invented.)

Start by getting some understanding:

http://www.joelonsoftware.com/articles/Unicode.html


Then read this post from just over a week ago:

https://mail.python.org/pipermail/python-list/2013-October/657827.html
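
If, after reading those, some downstream system still insists on pure
ASCII, here is a minimal sketch of one common approach, not a definitive
recipe. It assumes the bytes curl fetched are UTF-8 (check the HTTP
headers or the page's meta tag; I am only guessing) and that you are on
Python 3. The helper name strip_accents is made up for illustration.
Decode first, and only then decide whether to throw the accents away:

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after NFKD decomposition (lossy!)."""
    # NFKD splits "á" into "a" followed by U+0301 COMBINING ACUTE ACCENT.
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Assumption: the server sent UTF-8 encoded bytes.
raw = b'<a href="ShowRatings.jsp?tid=1312168">Alc\xc3\xa1ntar, Iliana</a>'

text = raw.decode("utf-8")      # bytes -> str, nothing lost yet
ascii_only = strip_accents(text)

print(text)
print(ascii_only)
```

This only handles characters that decompose into a base letter plus
combining marks. Letters with no such decomposition, such as "ø", pass
through unchanged, so the result is not guaranteed to be ASCII.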



-- 
Steven


