[Tutor] formatting xml (again)

David Rock david at graniteweb.com
Tue Dec 27 19:46:33 EST 2016


* Alan Gauld via Tutor <tutor at python.org> [2016-12-28 00:40]:
> On 27/12/16 19:44, richard kappler wrote:
> > Using python 2.7 - I have a large log file we recorded of streamed xml data
> > that I now need to feed into another app for stress testing. The problem is
> > the data comes in 2 formats.
> > 
> > 1. each 'event' is a full set of xml data with opening and closing tags +
> > x02 and x03 (stx and etx)
> > 
> > 2. some events have all the xml data on one 'line' in the log, others are
> > in typical nested xml format with lots of white space and multiple 'lines'
> > in the log for each event, the first line of th e 'event' starting with an
> > stx and the last line of the 'event' ending in an etx.
> 
> It sounds as if an xml parser should work for both. After all
> xml doesn't care about layout and whitespace etc.
> 
> Which xml parser are you using - I assume you are not trying
> to parse it manually using regex or string methjods - that's
> rarely a good idea for xml.

Yeah, since everything appears to be <data>..</data>, the "event" flags
of [\x02] [\x03] may not even matter if you use an actual parser.

-- 
David Rock
david at graniteweb.com


More information about the Tutor mailing list