[Expat-discuss] & symbol workaround
Brad Causey
bradcausey at gmail.com
Wed Feb 4 20:56:11 CET 2009
Hi list,
I am working on a Python script that parses around 6800 small XML files.
My code isn't pretty, as I am just testing a PoC at this point, but I have
run into a problem. When the parser hits an ampersand (&) in one of the
files, the script quits with
"xml.parsers.expat.ExpatError: not well-formed (invalid token): line 28,
column 41"
I am trying to figure out a way to work around this without modifying the
XML files themselves, as these need to be preserved in their original format.
Here is my code:
<begin code>
import xml.parsers.expat
import string
import os

#var setup
list = []
values = []
indexy = ('RulesVersion', 'AuditDate', 'ComputerName', 'UserName',
          'UserDomain', 'OSName', 'OSServicePack', 'OSBuild',
          'AntiVirusProduct', 'ExeVersion', 'SigsVersion', 'Active',
          'Timeout', 'PasswordRequired', 'PasswordLength', 'Modem',
          'Dialtone')
out = open('test.txt','w')

#handler functions
def start_element(name, attrs):
    name = str(name)
    list.append(name)

def end_element(name):
    name = str(name)
    list.append(name)

def char_data(data):
    data = str(data)
    list.append(data)

#file parsing
xlist = os.popen(r"dir /od /a-d /b *.xml").read().splitlines()
for i in xlist:
    print i
    p = xml.parsers.expat.ParserCreate('ASCII')
    p.StartElementHandler = start_element
    p.EndElementHandler = end_element
    p.CharacterDataHandler = char_data
    values.append(i)
    file = open(i,'r')
    p.ParseFile(file)
    for item in indexy:
        check = item
        try:
            item = list.index(item)
            if check == 'AntiVirusProduct':
                values.append(list[item+3])
            elif check == 'Modem':
                values.append(list[item+3])
            else:
                values.append(list[item+1])
        except:
            values.append('NOT FOUND')
    file.close()
    print values
    list = []
    values = []
<end code>
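
One possible direction for the ampersand error described above, as a rough
sketch only: read each file into memory, escape any bare "&" there, and pass
the resulting string to Parse() instead of calling ParseFile(). The files on
disk are never touched. The names BARE_AMP, escape_bare_ampersands and
parse_lenient are made up for this illustration. The sketch assumes the
offending characters are ampersands that do not start an entity or character
reference; it would also rewrite ampersands inside CDATA sections or comments,
and it does nothing for undefined named entities.

```python
import re
import xml.parsers.expat

# Hypothetical helper pattern: an '&' that does NOT begin something shaped
# like an entity or character reference (&amp; &#38; &#x26; ...).
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)')

def escape_bare_ampersands(text):
    """Return text with every bare '&' replaced by '&amp;'."""
    return BARE_AMP.sub('&amp;', text)

def parse_lenient(text, start=None, end=None, chars=None):
    """Parse an XML string after escaping bare ampersands in memory."""
    p = xml.parsers.expat.ParserCreate()
    if start is not None:
        p.StartElementHandler = start
    if end is not None:
        p.EndElementHandler = end
    if chars is not None:
        p.CharacterDataHandler = chars
    # The second argument marks this as the final chunk of the document.
    p.Parse(escape_bare_ampersands(text), True)

# Usage: collect character data from a document containing a bare '&'.
collected = []
parse_lenient('<a><b>Tom & Jerry</b></a>', chars=collected.append)
```

In the script above this would mean replacing p.ParseFile(file) with
something like p.Parse(escape_bare_ampersands(file.read()), True). Expat may
deliver character data in several pieces, so the handler output should be
joined before it is compared or indexed.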
-B