decode Numeric Character References to unicode

Ben Finney bignose+hates-spam at benfinney.id.au
Mon Feb 18 07:31:15 EST 2008


7stud <bbxx789_05ss at yahoo.com> writes:

> For instance, an 'o' with umlaut can be represented in three
> different ways:
> 
> '&' followed by 'ouml;'
> '&' followed by '#246;'
> '&' followed by '#xf6;'

The fourth way, of course, is to simply have 'ö' appear directly as a
character in the document, and set the correct character encoding.
(Hint: UTF-8 is an excellent choice for "the correct character
encoding", if you get to choose.)
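
As a rough sketch of why the four forms are equivalent, assuming a
modern Python 3 (3.4 or later, where the standard library provides
html.unescape; the variable names here are only illustrative), the
three references decode to the same single character you would get by
writing it directly:

```python
import html

# The three reference forms quoted above, written out in full.
references = ["&ouml;", "&#246;", "&#xf6;"]

# html.unescape resolves named, decimal and hexadecimal references.
decoded = [html.unescape(ref) for ref in references]

# The "fourth way": the character itself (U+00F6), no reference needed.
direct = "\u00f6"

# All three references yield exactly that one character.
all_same = all(ch == direct for ch in decoded)

# Stored in a UTF-8 document, the character occupies two bytes.
utf8_bytes = direct.encode("utf-8")

print(decoded, all_same, utf8_bytes)
```

Once the document is decoded with the right encoding, the direct form
needs no unescaping step at all, which is the main practical argument
for it.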

-- 
 \        “With Lisp or Forth, a master programmer has unlimited power |
  `\     and expressiveness. With Python, even a regular guy can reach |
_o__)                               for the stars.” —Raymond Hettinger |
Ben Finney


