A question about unicode() function
JTree
eastera at gmail.com
Mon Jan 1 02:07:08 EST 2007
Hi,
I changed my code to:
#!/usr/bin/python
#Filename: test.py
#Modified: 2007-01-01
import cPickle as p
import urllib
import htmllib
import re
import sys
funUrlFetch = lambda url:urllib.urlopen(url).read()
objUrl = raw_input('Enter the Url:')
content = funUrlFetch(objUrl)
content = content.encode('gb2312','ignore')
print content
content.close()
I used "ignore" to deal with the data loss, but it still caused an
error:
C:\WINDOWS\system32\cmd.exe /c python tianya.py
Enter the Url:http://www.tianya.cn
Traceback (most recent call last):
  File "tianya.py", line 17, in ?
    content = content.encode('gb2312','ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbb in position 88: ordinal not in range(128)
shell returned 1
Hit any key to close this window...
My Python version is 2.4. Does it have problems with Asian
encoding support?
Thanks!
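
For what it is worth, the traceback above is consistent with how Python 2 treats byte strings: calling .encode() on a str first decodes it implicitly with the default 'ascii' codec, and that hidden step is what fails on byte 0xbb. Below is a minimal sketch of that behaviour, assuming the page really is GB2312-encoded. The sample bytes and the helper name implicit_ascii_step are made up for illustration, and the code is written to run on Python 2.6+ and Python 3 rather than on 2.4.

```python
# Bytes as they might arrive from the network: two Chinese characters
# encoded in GB2312 (sample data, not taken from the real page).
raw = b'\xc4\xe3\xba\xc3'


def implicit_ascii_step(data):
    """Make explicit the ASCII decode that Python 2 performs inside str.encode().

    Returns the UnicodeDecodeError if the bytes are not pure ASCII, else None.
    """
    try:
        data.decode('ascii')
        return None
    except UnicodeDecodeError as exc:
        return exc


# The hidden step fails because 0xc4 is outside range(128).
error = implicit_ascii_step(raw)

# Decoding with the actual encoding turns the bytes into text instead.
text = raw.decode('gb2312', 'ignore')

# Encoding is the opposite direction: text back to bytes.
round_trip = text.encode('gb2312')
```

If that assumption about the page holds, content.decode('gb2312', 'ignore') is probably what the script needs, since urlopen(url).read() already returns bytes rather than a unicode object.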
On Dec 31 2006, 9:30 pm, "Felipe Almeida Lessa"
<felipe.le... at gmail.com> wrote:
> On 31 Dec 2006 05:20:10 -0800, JTree <east... at gmail.com> wrote:
>
> > def funUrlFetch(url):
> >     lambda url:urllib.urlopen(url).read()
>
> This function only creates a lambda function (that is not used or
> assigned anywhere), nothing more, nothing less. Thus, it returns None
> (sort of "void") no matter what its argument is. Probably you meant
> something like
>
> def funUrlFetch(url):
>     return urllib.urlopen(url).read()
>
> or
>
> funUrlFetch = lambda url:urllib.urlopen(url).read()
>
> > objUrl = raw_input('Enter the Url:')
> > content = funUrlFetch(objUrl)
>
> content gets assigned None. Try putting "print content" before the unicode line.
>
> > content = unicode(content,"gbk")
>
> This, equivalent to unicode(None, "gbk"), leads to
>
> > TypeError: coercing to Unicode: need string or buffer, NoneType found
>
> None is neither a string nor a buffer, so unicode() complains.
>
> See ya,
>
> --
> Felipe.
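
The point in the quoted reply about the def that only builds a lambda can be checked without touching the network. This is a rough sketch: fetch_stub is a hypothetical stand-in for urllib.urlopen(url).read(), and the other names are invented here for illustration.

```python
def fetch_stub(url):
    """Hypothetical stand-in for urllib.urlopen(url).read(); no network access."""
    return 'page body for ' + url


def broken_fetch(url):
    # The lambda object is created and immediately discarded.
    # With no return statement, the function returns None.
    lambda url: fetch_stub(url)


def fixed_fetch(url):
    # Same body, but the result is actually returned to the caller.
    return fetch_stub(url)


# The one-line alternative: bind the lambda itself to a name.
fixed_lambda = lambda url: fetch_stub(url)

broken_result = broken_fetch('http://example.invalid')
fixed_result = fixed_fetch('http://example.invalid')
lambda_result = fixed_lambda('http://example.invalid')
```

Here broken_result is None, while the other two hold the stub's return value, which matches the TypeError about NoneType from the earlier version of the script.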