ascii character - removing chars from string

Steven D'Aprano steve at REMOVETHIScyber.com.au
Tue Jul 4 12:34:57 EDT 2006


On Tue, 04 Jul 2006 09:01:15 -0700, bruce wrote:

> update...
> 
> the error i'm getting...
> 
>>>>UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in
> position 62: ordinal not in range(128)

Okay, now we're making progress -- we know what exception you're getting.
Now, how about telling us what you did to get that exception?


> is there a way i can tell/see what the exact char is at pos(62). i was
> assuming that it's the hex \xa0.

That's what it's saying.
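
To check the offending character yourself, one option is to catch the exception and look at its attributes. This is only a sketch: the helper name describe_encode_failure and the sample string are made up for illustration, and the code is written so it also runs on current Python versions (the u'' prefix is accepted there too).

```python
def describe_encode_failure(text, encoding='ascii'):
    """Return (position, character, ordinal) for the first character that
    cannot be encoded, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        # err.object is the string being encoded; err.start is the index
        # of the first character the codec could not handle.
        bad = err.object[err.start]
        return err.start, bad, ord(bad)
    return None


# Hypothetical input: 62 ordinary characters, then a non-breaking space.
sample = u'x' * 62 + u'\xa0' + u'tail'
result = describe_encode_failure(sample)
print(result)
```

The position in the tuple is the same number that appears in the error message, and the ordinal 160 is 0xa0 in hex.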

> i've done the s.replace('\xa0','') with no luck.

What does that mean? What does it do? Crash? Raise an exception? Return a
string you weren't expecting? More detail please.

Here is some background you might find useful. My apologies if you already
know it:

"Ordinary" strings in Python are delimited with quote marks, either
matching " or '. At the risk of over-simplifying, these strings can
contain only single-byte characters, i.e. ordinal values 0 through 255, or
in hex, 0x00 through 0xFF. The character you are having a problem with is
within that range of single bytes: ord(u'\xa0') == 160.

Notice that a string '\xa0' is a single byte; a Unicode string u'\xa0' is
a different type of object, even though its one character has the same
ordinal value, 160.
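
A small sketch of that distinction follows. One assumption to flag: the b'' prefix is used here so the byte string stays a byte string on current Python versions; in the Python discussed in this thread, a plain '\xa0' is already a byte string.

```python
byte_string = b'\xa0'   # one raw byte with value 0xA0
text_string = u'\xa0'   # one Unicode character, code point U+00A0

# Both have length 1 and the same ordinal value...
byte_ordinal = ord(byte_string)
text_ordinal = ord(text_string)

# ...but they are objects of different types.
same_type = type(byte_string) is type(text_string)

print(byte_ordinal, text_ordinal, same_type)
```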

String methods will blindly operate on any string, regardless of what
bytes are in it. However, converting from unicode to ordinary strings is
NOT the same -- the conversion goes through the 'ascii' codec named in
your error message, and the *character* u'\xa0' (ordinal 160) is not a
valid ASCII character, since ASCII only uses the range chr(0) through
chr(127).
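
Here is a rough sketch of the failure and a few possible ways around it. The sample text is invented, and which approach is right depends on what you want to happen to the character. Note that the replace() call is given a Unicode argument, u'\xa0', to match the Unicode string it operates on.

```python
text = u'price:\xa0100'   # hypothetical sample containing U+00A0

# A strict conversion to ASCII fails, as in the reported error.
try:
    text.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Option 1: remove the character first, then encode.
cleaned = text.replace(u'\xa0', u'')
ascii_bytes = cleaned.encode('ascii')

# Option 2: let the codec silently drop anything it cannot encode.
dropped = text.encode('ascii', 'ignore')

# Option 3: let the codec substitute '?' for anything it cannot encode.
marked = text.encode('ascii', 'replace')

print(strict_failed, ascii_bytes, dropped, marked)
```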

If this is confusing to you, you're not alone.



-- 
Steven.



