ascii character - removing chars from string

bruce bedouglas at earthlink.net
Tue Jul 4 12:37:53 EDT 2006


thanks for your replies!!

the solution..

 dd = dd.replace(u'\xa0','')

this replaces the nbsp character (hex a0) with an empty string ''. i thought
i had tried this early in the process.. but i may have screwed up the
typing...
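
A minimal sketch of the replacement above, assuming Python 3 (where the u'' prefix is still accepted). The sample text in dd is hypothetical and not taken from the original data.

```python
# Hypothetical sample text containing a non-breaking space (U+00A0).
dd = u'price:\xa0100'

# Remove every non-breaking space. Pass u' ' as the second argument
# instead if the word boundary should be kept as a plain space.
cleaned = dd.replace(u'\xa0', u'')

print(repr(cleaned))
```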

-bruce


-----Original Message-----
From: python-list-bounces+bedouglas=earthlink.net at python.org
[mailto:python-list-bounces+bedouglas=earthlink.net at python.org]On Behalf
Of Steven D'Aprano
Sent: Tuesday, July 04, 2006 9:35 AM
To: python-list at python.org
Subject: RE: ascii character - removing chars from string


On Tue, 04 Jul 2006 09:01:15 -0700, bruce wrote:

> update...
>
> the error i'm getting...
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in
> position 62: ordinal not in range(128)

Okay, now we're making progress -- we know what exception you're getting.
Now, how about telling us what you did to get that exception?


> is there a way i can tell/see what the exact char is at pos(62). i was
> assuming that it's the hex \xa0.

That's what it's saying.
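
One way to check the character at the reported position directly is sketched below. It assumes Python 3; the helper name describe_char and the sample string are hypothetical, built only so that the offending character lands at index 62.

```python
import unicodedata

def describe_char(text, pos):
    """Return (repr, code point, Unicode name) for the character at text[pos]."""
    ch = text[pos]
    return repr(ch), ord(ch), unicodedata.name(ch, 'UNKNOWN')

# Hypothetical sample: 62 ASCII characters followed by a non-breaking space.
sample = u'x' * 62 + u'\xa0' + u'tail'

try:
    sample.encode('ascii')
except UnicodeEncodeError as err:
    # err.start is the index of the first character that could not be encoded,
    # and err.object is the string that was being encoded.
    pos = err.start
    info = describe_char(err.object, pos)
    print(pos, info)
```

The exception object carries the same position that appears in the error message, so there is no need to count characters by hand.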

> i've done the s.replace('\xa0','') with no luck.

What does that mean? What does it do? Crash? Raise an exception? Return a
string you weren't expecting? More detail please.

Here is some background you might find useful. My apologies if you already
know it:

"Ordinary" strings in Python are delimited with quote marks, either
matching " or '. At the risk of over-simplifying, these strings can
contain only single-byte characters, i.e. ordinal values 0 through 255, or
in hex, 0 through FF. The character you are having a problem with is
within that range of single bytes: ord(u'\xa0') = 160.

Notice that a string '\xa0' is a single byte; a Unicode string u'\xa0' is
a different type of object, even though its one character has the same
ordinal value (160).
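
The distinction can be sketched as follows, assuming Python 3, where the "ordinary" byte string described above is written with a b prefix and has type bytes. The variable names are illustrative only.

```python
byte_string = b'\xa0'   # one raw byte with value 160
text_string = u'\xa0'   # one Unicode character, U+00A0 (NO-BREAK SPACE)

# Two different types of object.
print(type(byte_string).__name__, type(text_string).__name__)

# Same numeric value: indexing bytes yields an int in Python 3.
same_ordinal = byte_string[0] == ord(text_string)

# But the objects themselves do not compare equal.
same_object = byte_string == text_string

print(same_ordinal, same_object)
```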

String methods will blindly operate on any string, regardless of what
bytes are in them. However, converting from unicode to ordinary strings is
NOT the same -- the *character* chr(160) is not a valid ASCII character,
since ASCII only uses the range chr(0) through chr(127).
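
A short sketch of that conversion, again assuming Python 3 and a hypothetical sample string: encoding to ASCII fails on chr(160) unless an error handler is given.

```python
text = u'10\xa0km'  # hypothetical sample containing a non-breaking space

# The default (strict) error handling raises, because 160 is outside 0..127.
try:
    text.encode('ascii')
    failed = False
except UnicodeEncodeError:
    failed = True

# Optional error handlers trade the exception for data loss.
dropped = text.encode('ascii', 'ignore')    # silently drops the character
replaced = text.encode('ascii', 'replace')  # substitutes a '?'

print(failed, dropped, replaced)
```

Both handlers discard information, so replacing the character explicitly before encoding, as in the solution at the top of this thread, is usually the more predictable choice.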

If this is confusing to you, you're not alone.



--
Steven.

