Newbie: How to convert "<" to "&lt;" ( encoding? )

Fredrik Aronsson d98aron at dtek.chalmers.se
Tue Oct 3 19:54:54 EDT 2000


In article <9UsC5.27$fn2.59024 at news.pacbell.net>,
	"Eric" <ewalstad at yahoo.com> writes:
>> replace (str, old, new[, maxsplit]) -> string
> Thanks Bjorn.  That DOES take care of the two examples 
> I gave.  However, I am trying to implement something 
> that will handle all the encoding needed to make a string 
> "HTML friendly."  I'm sorry, there is a name for the 
> kind of encoding I am trying to do, but I don't know 
> what that name is (HTML encoding?).

Here is an example which replaces every character that has a named
HTML entity with that entity.

import string

# Load dictionary of entities (HTML 2.0 only...)
from htmlentitydefs import entitydefs
# Here you could easily add more entities if needed...

def html_encode(s):
    s = string.replace(s,"&","&amp;")  # replace "&" first

    #runs one replace for each entity except "&"
    for (ent,char) in entitydefs.items():
        if char != "&": 
            s = string.replace(s,char,"&"+ent+";")
    return s

>>> print html_encode("this <is> a <string> with <nested <brackets>>")
this &lt;is&gt; a &lt;string&gt; with &lt;nested &lt;brackets&gt;&gt;
>>> print html_encode("&<>åäöÅÄÖߣéèãñîê")
&amp;&lt;&gt;&aring;&auml;&ouml;&Aring;&Auml;&Ouml;&szlig;&pound;&eacute;&egrave;&atilde;&ntilde;&icirc;&ecirc;
>>> 
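
Why "&" has to go first is easiest to see by doing it in the wrong
order. The following is only a small sketch for a current Python 3
interpreter (so it uses str.replace rather than the string module);
the two function names are made up for the illustration and are not
part of any library.

```python
def escape_amp_first(s):
    # Correct order: escape "&" before introducing any new "&" characters.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    return s

def escape_amp_last(s):
    # Wrong order: the "&" in the freshly inserted "&lt;" is escaped again.
    s = s.replace("<", "&lt;")
    s = s.replace("&", "&amp;")
    return s

print(escape_amp_first("a < b & c"))  # a &lt; b &amp; c
print(escape_amp_last("a < b & c"))   # a &amp;lt; b &amp; c
```

The second result is double-escaped, so a browser would display the
literal text "&lt;" instead of "<".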

Another (probably better) solution, which walks the string once
instead of calling replace once per entity, is:

import string

from htmlentitydefs import entitydefs  

inv_entitydefs = {}
for (ent,char) in entitydefs.items():
    inv_entitydefs[char]="&"+ent+";"  # Invert dictionary

def html_encode2(s):
    res=""
    for c in s:                       # Just loops through the string once
        if inv_entitydefs.has_key(c): # looking for characters
            res=res+inv_entitydefs[c] # to exchange
        else:
            res=res+c
    return res
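
For anyone reading this with a current Python 3, the same single-pass
idea might look roughly like the sketch below. It assumes the standard
html.entities module (the Python 3 home of htmlentitydefs), whose
codepoint2name dictionary is already the "inverted" table, and it
collects the pieces in a list instead of growing a string. The name
html_encode3 is made up here. If only the markup characters need
escaping, html.escape from the standard library is probably enough.

```python
import html
from html.entities import codepoint2name

def html_encode3(s):
    """Replace each character that has a named entity with "&name;"."""
    parts = []
    for c in s:  # one pass over the string
        name = codepoint2name.get(ord(c))
        parts.append("&%s;" % name if name else c)
    return "".join(parts)

print(html_encode3("<a & b>"))  # &lt;a &amp; b&gt;
print(html_encode3("åäö"))      # &aring;&auml;&ouml;

# html.escape only touches &, <, > (and quotes unless quote=False)
print(html.escape("<a & b>", quote=False))  # &lt;a &amp; b&gt;
```

Because every character is looked at exactly once, there is no
ordering problem with "&" in this version.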

'made-them-two-minutes-ago-so-be-careful-ly' yours
Fredrik




