xml processing : too slow...
Shagshag13
shagshag13 at yahoo.fr
Thu Jul 25 10:44:09 EDT 2002
> p.parse('<fict>%s</fict>' % line, 1)
>
> should be satisfactory for checking this kind of "sort of
> well-formedness", unless there are yet more specs as yet
> unexpressed.
That's why I had tried:
>>> anotherline = '<root>' + line + '</root>'
>>> p.Parse(anotherline, 1)
Traceback (most recent call last):
  File "<pyshell#14>", line 1, in ?
    p.Parse(anotherline, 1)
ExpatError: junk after document element: line 1, column 0
but it still doesn't work, any more than this does:
>>> p.Parse('<fict>%s</fict>' % line, 1)
Traceback (most recent call last):
  File "<pyshell#185>", line 1, in ?
    p.Parse('<fict>%s</fict>' % line, 1)
ExpatError: junk after document element: line 1, column 0
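A minimal sketch of the check being attempted here. It rests on an assumption, since the session above does not show how `p` was created: if the same parser object had already received a final `Parse(..., 1)` call, expat treats it as finished and rejects any further input, whatever that input is. Creating a fresh parser for each line avoids that. The helper name `is_well_formed` is made up for illustration.

```python
from xml.parsers import expat

def is_well_formed(line):
    """Return True if `line` parses as the content of a dummy root element."""
    # Assumption: a new parser per line, because an expat parser that has
    # already seen a final Parse() call cannot be fed another document.
    p = expat.ParserCreate()
    try:
        # The dummy <fict> root lets a line hold several sibling elements.
        p.Parse('<fict>%s</fict>' % line, True)
    except expat.ExpatError:
        return False
    return True
```

With this helper, a line of plain words or several sibling elements passes, while the two quoted examples (unquoted attribute value, unterminated reference) are rejected.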
> How would that help you diagnosed e.g.
> <bah thisis=notvalid>of course not</bah>
> as not being well formed? This is not well formed because
> it lacks quotes around an attribute's value. Or:
> <bah thisis="notvalid">&either</bah>
> now THIS is not well formed because reference '&either'
> is not terminated with a semicolon. Etc, etc.
That's right, I didn't address this kind of thing... :(
> But _we_ can't know, unless you DO tell us the specs.
That was the point of my example: a line can contain anything among words, numbers, punctuation and some symbols like $.
Spaces are separators. There are no entities like &entities;, or if one appears it should be treated as a simple word.
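One way to read the "treat it as a simple word" rule, sketched under the assumption that every `&` in a line is literal text: escape it before handing the line to expat, so something like `&either` no longer counts as a broken reference. The function name `check_line` is hypothetical, and this is only one possible interpretation of the spec.

```python
from xml.parsers import expat

def check_line(line):
    """Check a line after turning every '&' into literal text."""
    # Assumption: the input never contains real entity references,
    # so every '&' can safely be escaped to '&amp;'.
    escaped = line.replace('&', '&amp;')
    p = expat.ParserCreate()  # fresh parser for each line
    try:
        p.Parse('<fict>%s</fict>' % escaped, True)
    except expat.ExpatError:
        return False
    return True
```

Under this rule `<bah thisis="notvalid">&either</bah>` is accepted, while the unquoted attribute in `<bah thisis=notvalid>` is still reported as an error.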
thanks,
s13.