getattr/setattr still ASCII-only, not Unicode - blows up SGMLlib from BeautifulSoup

Terry Reedy tjreedy at udel.edu
Thu Mar 13 17:30:50 EDT 2008


"John Nagle" <nagle at animats.com> wrote in message 
news:47d97288$0$36363$742ec2ed at news.sonic.net...
|   Just noticed, again, that getattr/setattr are ASCII-only, and don't support
| Unicode.
|
|   SGMLlib blows up because of this when faced with a Unicode end tag:
|
| File "/usr/local/lib/python2.5/sgmllib.py", line 353, in finish_endtag
| method = getattr(self, 'end_' + tag)
| UnicodeEncodeError: 'ascii' codec can't encode character u'\xae'
| in position 46: ordinal not in range(128)
|
| Should attributes be restricted to ASCII, or is this a bug?

Except for comments and string literals preceded by an encoding
declaration, Python code is ASCII only:
 "Python uses the 7-bit ASCII character set for program text."
(Reference Manual, section 2, Lexical analysis)

This changes in 3.0, where identifiers may contain non-ASCII characters.
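
As a rough illustration of the 3.0 behaviour, here is a minimal sketch,
assuming a Python 3 interpreter. Handler and find_end_handler are made-up
names for this example and are not part of sgmllib. It only shows that
getattr/setattr accept non-ASCII attribute names there, and that a default
argument to getattr avoids an exception when no handler exists:

```python
class Handler(object):
    """Hypothetical stand-in for a parser that dispatches on tag names."""

    def end_p(self):
        return "closed p"


def find_end_handler(parser, tag):
    # Look up 'end_<tag>'; return None instead of raising when it is missing.
    return getattr(parser, "end_" + tag, None)


parser = Handler()

# An ordinary ASCII tag resolves to the bound method.
plain = find_end_handler(parser, "p")

# A tag containing u'\xae' (as in the traceback) has no handler here,
# so the lookup quietly returns the default.
odd = find_end_handler(parser, "p\xae")

# setattr also accepts a non-ASCII name in Python 3.
setattr(parser, "end_caf\xe9", lambda: "closed caf\xe9")
accented = find_end_handler(parser, "caf\xe9")
```

Under 2.5 the same lookup with a unicode tag is what triggers the
UnicodeEncodeError quoted above, since the name has to be encoded to ASCII
first.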





