parsing "&A" in a string..

bruce bedouglas at earthlink.net
Sun Aug 31 15:56:55 EDT 2008


Hi.

A pretty simple question, I'm guessing.

I have a text/html string that looks like:
 	....(A&E)

The issue is that when I parse it using xpath/node/toString,
I get the following:

...(A&E;).

Note the extra semicolon ";" after the "E". I've tried the encoding option
of toString, with no luck.

The test chunk of code I'm using is:
.
.
.
# select the text node of every link in the left-hand nav list
dpath = "//div/ul[@id='leftNavListing']/li[position()>0]//a/text()"
ldepts_ = d.xpath(dpath)
for ldept in ldepts_:
    dept = ldept.nodeValue
    print "dept =", ldept.toString()
    # locate the parentheses around the department code
    start = re.search(r"\(", dept).span()
    end = re.search(r"\)", dept).span()
    print start, end
    print dept[start[0]], dept[end[0]]
    # keep only the text between the parentheses
    dept = dept[start[1]:end[0]]
    print dept
.
.
.
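
For reference, the extraction step in the loop above can be tried on its own, without the parser. This is only a sketch in Python 3 syntax: the sample string and the helper name extract_parens are made up for illustration, and it uses one regex instead of two separate searches, so it takes the first "(" and the next ")" after it.

```python
import re

def extract_parens(text):
    """Return the text between the first '(' and the next ')', or None."""
    match = re.search(r"\(([^)]*)\)", text)
    return match.group(1) if match else None

# made-up sample string, standing in for a node value from the page
sample = "Arts and Entertainment (A&E)"
print(extract_parens(sample))
```

Returning None when there are no parentheses avoids the AttributeError that a bare re.search(...).span() raises on a non-matching string.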

So, any thoughts/pointers on how I can remove the ";" would be helpful.
I'm assuming there is a way to enforce a given encoding that would
remove the ";" issue...
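
A possible stopgap, independent of any encoding setting, is to clean up the serialized string afterwards. The sketch below assumes (unconfirmed) that the stray ";" appears because "&E" is being treated as an entity reference. It drops the ";" only when the name is not a known HTML entity. The function name strip_bogus_semicolons is made up; the code is Python 3 syntax (html.entities; the same table is in htmlentitydefs on Python 2).

```python
import re
from html.entities import name2codepoint

def strip_bogus_semicolons(text):
    """Turn '&name;' back into '&name' when name is not a known HTML entity."""
    def fix(match):
        name = match.group(1)
        if name in name2codepoint:
            return match.group(0)  # real entity such as &amp;: leave it alone
        return "&" + name          # unknown name: drop the trailing ';'
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", fix, text)

print(strip_bogus_semicolons("(A&E;)"))  # (A&E)
```

This treats the symptom only; real entities such as &amp; and numeric references are left untouched.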

Thanks



