playing with pyGoogle - strange codec error

Erik Max Francis max at alcyone.com
Tue Apr 5 14:55:48 EDT 2005


Brian Blazer wrote:

> You know, I am beginning to think that I MAY have stumbled on a bug 
> here.  At first I was thinking that this issue was related to the 
> offending character being out of range for the Mac.  Then I tried it on 
> A MS machine and a linux box; all with the same error.

The problem, common to all three, is that your terminal's default 
encoding cannot represent the copyright character (in the first case, 
the default encoding is 'ascii'; that is likely true of the others as 
well).

When you print a Unicode string, it is encoded to your default 
encoding.  That cannot be done faithfully for a string containing a 
non-ASCII character (such as the copyright character that is actually 
triggering this for you), so the encoding fails with an error.
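
As a minimal sketch of that failure, the snippet below performs explicitly the 'ascii' encoding that print attempts implicitly.  The sample text and the variable names are made up for illustration; only the copyright character (U+00A9) matters.  It is written so that it also runs on Python 3, where the default Unicode string type is str.

```python
# Hypothetical sample text; the copyright sign is at index 10.
text = u"Copyright \u00a9 2005"

try:
    text.encode("ascii")
    failed = False
    bad_index = None
except UnicodeEncodeError as exc:
    # 'ascii' has no byte for U+00A9, so the codec raises.
    failed = True
    bad_index = exc.start  # index of the first unencodable character
```

The exception's start attribute points at the offending character, which is a quick way to find out what is actually tripping the codec.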

What you probably want here is either to encode to a different 
encoding (one which you know your terminal supports even though it is 
not detected, e.g., 'latin-1'), or to specify what to do with 
characters that cannot be encoded (e.g., 'ignore', which removes the 
offending characters, or 'replace', which replaces them with question 
marks):

	aUnicodeString.encode('latin-1')
	aUnicodeString.encode('ascii', 'replace')
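
To make the difference between these options concrete, here is a small sketch that encodes one string three ways.  The sample text and variable names are assumptions for illustration, and whether 'latin-1' is the right choice depends on what your terminal really accepts.

```python
# Hypothetical sample text containing the copyright sign (U+00A9).
text = u"Copyright \u00a9 2005"

# latin-1 can represent U+00A9 as the single byte 0xA9.
as_latin1 = text.encode("latin-1")

# 'replace' substitutes a question mark for each unencodable character.
replaced = text.encode("ascii", "replace")

# 'ignore' silently drops each unencodable character.
ignored = text.encode("ascii", "ignore")
```

Both error handlers lose information, so they are best kept for display purposes and not for data you intend to store or send on.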

> This does not happen when I wrote the same script in java.  This is 
> making me wonder if there is an issue with the wrapper for the google 
> api that was originally done in java.

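Java does not handle Unicode the same way.  When it writes a string to 
the console, characters that the output encoding cannot represent are 
silently replaced (typically with '?') instead of raising an error, so 
the same data can appear to work there.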

-- 
Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
   Drifting from woman-who-tries misconstrued / Shifting to woman-wise
   -- Lamya


