Python to do CDC on XML files

Peter Otten __peter__ at web.de
Thu Mar 24 04:19:32 EDT 2016


Bruce Kirk wrote:

> Does anyone know of any existing projects for generating a change data
> capture (CDC) from two very large XML files?
> 
> The XML structures are the same; it is the data within the files that may
> differ.
> 
> I need to take an XML file from yesterday, compare it to the XML file
> produced today, and note which XML records have changed.
> 
> I have done a Google search and have not been able to find much on the
> subject other than software vendors trying to sell me their products. :-)

There is

http://www.logilab.org/project/xmldiff

As an alternative you may try to log the changes as they occur instead of 
inspecting the result. If the application generating the file is not under 
your control, does it offer other output formats, e.g. CSV?

Or, if the XML file is basically a sequence of one type of node, you may 
convert it to a database (SQLite will do) to match and compare the 
"records".

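A rough sketch of the database idea, using only the standard library. It
makes assumptions the original question does not confirm: that every record
is an element with the (hypothetical) tag "record", and that each record
carries a unique key in a (hypothetical) "id" attribute. Adjust both to the
real file layout. Records are compared by a hash of their serialized form,
so a difference in whitespace inside a record counts as a change.

```python
import hashlib
import sqlite3
import xml.etree.ElementTree as ET


def load_records(conn, table, xml_path, record_tag="record", key_attr="id"):
    """Stream one XML file into a table of (key, digest) rows.

    record_tag and key_attr are assumptions about the file layout.
    """
    conn.execute(f"CREATE TABLE {table} (key TEXT PRIMARY KEY, digest TEXT)")
    for _event, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag != record_tag:
            continue
        # Text following the closing tag is not part of the record itself.
        elem.tail = None
        digest = hashlib.sha256(ET.tostring(elem)).hexdigest()
        conn.execute(
            f"INSERT INTO {table} VALUES (?, ?)", (elem.get(key_attr), digest)
        )
        # Drop the record's children and text to keep memory use low.
        elem.clear()
    conn.commit()


def changed_records(old_path, new_path, db_path=":memory:", **layout):
    """Return the keys of added, removed and changed records."""
    conn = sqlite3.connect(db_path)
    load_records(conn, "yesterday", old_path, **layout)
    load_records(conn, "today", new_path, **layout)

    def keys(sql):
        return [row[0] for row in conn.execute(sql)]

    result = {
        "added": keys(
            "SELECT key FROM today "
            "WHERE key NOT IN (SELECT key FROM yesterday) ORDER BY key"
        ),
        "removed": keys(
            "SELECT key FROM yesterday "
            "WHERE key NOT IN (SELECT key FROM today) ORDER BY key"
        ),
        "changed": keys(
            "SELECT t.key FROM today t JOIN yesterday y ON y.key = t.key "
            "WHERE y.digest <> t.digest ORDER BY t.key"
        ),
    }
    conn.close()
    return result
```

Something like changed_records("yesterday.xml", "today.xml") would then give
a dict with the three lists of keys. For very large inputs you would
probably pass a file name as db_path instead of the in-memory default, and
store the fields you care about in separate columns rather than a single
digest.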


