what module can do html encoder??

Gerrit gerrit at nl.linux.org
Mon Dec 13 06:09:20 EST 2004


richard wrote:
> Leon wrote:
> > example:
> >         s = ' ' ---> &nbsp;
> 
> That's technically not HTML encoding, that's replacing a perfectly valid
> space character with a *non-breaking* space character.

How can you tell?

s = ' ' # non-breaking space
s = ' ' # normal space
s = ' ' # em-space

But you might want to do something like:

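import htmlentitydefs

def escapechar(s):
    # s is a single unicode character
    n = ord(s)
    if n < 128:
        # plain ASCII passes through unchanged
        return s.encode('ascii')
    elif n in htmlentitydefs.codepoint2name:
        # use the named entity when HTML defines one
        return '&%s;' % htmlentitydefs.codepoint2name[n]
    else:
        # otherwise fall back to a numeric character reference
        return '&#%d;' % n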

This requires unicode strings: in a byte string a non-ASCII character may
span several bytes, and ord() only accepts a single character. Demonstration:

>>> escapechar(u'ò')
'&ograve;'
>>> escapechar(u'ş')
'&#351;'
>>> escapechar(u's')
's'
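
For readers on Python 3, the same idea might be sketched as below. This is
only an illustration, not part of the original suggestion: it assumes the
html.entities module (the Python 3 name for htmlentitydefs), and the helper
names escape_char and escape_text are made up for this example. Like the
function above, it leaves ASCII untouched, so '<', '>' and '&' are not
escaped; html.escape covers those.

```python
from html.entities import codepoint2name


def escape_char(ch):
    """Escape one character (hypothetical Python 3 port of escapechar)."""
    n = ord(ch)
    if n < 128:
        # Plain ASCII passes through unchanged.
        return ch
    if n in codepoint2name:
        # Use the named entity when HTML 4 defines one.
        return '&%s;' % codepoint2name[n]
    # Otherwise fall back to a numeric character reference.
    return '&#%d;' % n


def escape_text(text):
    """Apply escape_char to every character of a string."""
    return ''.join(escape_char(ch) for ch in text)


print(escape_text('caf\u00e9 \u015f'))  # caf&eacute; &#351;
```

With this, a non-breaking space (u'\u00a0') comes out as '&nbsp;' and an
em-space (u'\u2003') as '&emsp;', while a normal space stays a space.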

yours,
Gerrit Holl.



