ascii character - removing chars from string

bruce bedouglas at earthlink.net
Mon Jul 3 23:26:13 EDT 2006


simon...

the '&nbsp;' is not to be seen/viewed as text/ascii.. it's a representation
of the character u'\xa0' (hex a0, a non-breaking space) if i recall...

i'm looking to remove or replace the instances with a ' ' (space)
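
a minimal sketch of that replacement, assuming the text has already been decoded so that the non-breaking space shows up as the single character '\xa0'. clean_nbsp is only a hypothetical helper name, and the code is written for Python 3, where every str is unicode (on Python 2 the literals would be u'\xa0' and u' '):

```python
def clean_nbsp(text):
    """Replace each non-breaking space (U+00A0) with a plain space, then trim."""
    return text.replace('\xa0', ' ').strip()


text = 'foo cat\xa0\xa0'
ok_text = clean_nbsp(text)
print(repr(ok_text))  # 'foo cat'
```

replacing before stripping keeps the interior spacing intact, so 'foo\xa0cat' comes out as 'foo cat' rather than 'foocat'.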

-bruce


-----Original Message-----
From: python-list-bounces+bedouglas=earthlink.net at python.org
[mailto:python-list-bounces+bedouglas=earthlink.net at python.org]On Behalf
Of Simon Forman
Sent: Monday, July 03, 2006 7:17 PM
To: python-list at python.org
Subject: Re: ascii character - removing chars from string


bruce wrote:
> hi...
>
> update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;&nbsp;"
> which is valid HTML as the '&nbsp;' is a space.

&, n, b, s, p, ;  Those are all ascii characters.

> i need a way of stripping/removing the '&nbsp;' from the string
>
> the &nbsp; needs to be treated as a single char...
>
>  text = "foo cat&nbsp;&nbsp;"
>
>  ie ok_text = strip(text)
>
>  ok_text = "foo cat"

Do you really want to remove those html entities?  Or would you rather
convert them back into the actual text they represent?  Do you just
want to deal with &nbsp;'s?  Or maybe the other possible entities that
might appear also?

Check out htmlentitydefs.entitydefs (see
http://docs.python.org/lib/module-htmlentitydefs.html)  it's kind of
ugly looking so maybe use pprint to print it:

>>> import htmlentitydefs, pprint
>>> pprint.pprint(htmlentitydefs.entitydefs)
{'AElig': 'Æ',
 'Aacute': 'Á',
 'Acirc': 'Â',
.
.
.
 'nbsp': '\xa0',
.
.
.
etc...
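
As a rough sketch of the "convert them back" option, the same module also provides a name2codepoint mapping that can drive a regex substitution. The helper name decode_entities and the sample string are hypothetical, and the code targets Python 3, where htmlentitydefs was renamed html.entities (on Python 2, import from htmlentitydefs and use unichr instead of chr). It only handles named entities, not numeric ones like &#160;.

```python
import re
from html.entities import name2codepoint  # Python 2: from htmlentitydefs import name2codepoint

# Matches named entities such as &nbsp; or &amp; (numeric forms are not handled).
ENTITY_RE = re.compile(r'&(\w+);')


def decode_entities(text):
    """Replace known named entities with the characters they represent."""
    def substitute(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)  # leave unknown entities untouched
    return ENTITY_RE.sub(substitute, text)


raw = '&nbsp;foo&nbsp;cat&nbsp;&nbsp;'
decoded = decode_entities(raw)                  # entities become '\xa0'
ok_text = decoded.replace('\xa0', ' ').strip()  # then treat them as plain spaces
print(repr(ok_text))  # 'foo cat'
```

Decoding first and replacing afterwards means each &nbsp; is treated as a single character, and any other entities in the text come through as real characters instead of being dropped.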


HTH,
~Simon

"You keep using that word.  I do not think it means what you think it
means."
 -Inigo Montoya, "The Princess Bride"

--
http://mail.python.org/mailman/listinfo/python-list



