[Doc-SIG] Diffing reStructuredText documents that only differ by formatting

Martin Blais blais at furius.ca
Tue Mar 3 18:59:59 CET 2009


On Tue, 3 Mar 2009 17:57:12 +0000 (UTC), "Jeffrey C. Jacobs" <docutils.org.timehorse at neverbox.com> said:
> Gael Varoquaux <gael.varoquaux <at> normalesup.org> writes:
> > 
> > wdiff.
> 
> Thanks for the suggestions!  Unfortunately, one thing I forgot to mention
> was that the concatenations should not span different paragraphs.  Thus:
> 
> Hello!  World!
> 
> is not the same as:
> 
> Hello!
> 
> World!
> 
> Since the first represents 1 paragraph, but the second 2.
> 
> Instead, I propose the following Python script that diffs the docutils
> trees instead of the original text files.  I don't know how it could tell
> whether the 2 inputs are reStructuredText documents vs. regular text
> documents and only perform the doc-tree step if rst, and I welcome
> suggestions for improvements, but so far this does a good job of what I am
> trying to achieve.  Such a tool could be handy to rst documenters in
> cases where a document may have a bunch of lines through years of editing
> that go beyond 80 columns and thus the file is edited to bring it back in
> line, which produces massive standard diffs when the result really should
> more or less be the same document.  This script could be used to confirm
> that the two versions of documents are more or less the same.

This is great. BTW if you want to inspect your diffs graphically, you can tell xxdiff to use your program to compute the differences. It'll work if your program outputs POSIX diffs (which it likely does, because you're invoking GNU diff).
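
To make the paragraph constraint discussed above concrete, here is a minimal sketch. It is not Jeffrey's script and it does not build docutils trees: as a simpler stand-in, it assumes that splitting on blank lines is a good enough approximation of paragraph boundaries, collapses whitespace inside each paragraph, and compares the results with the standard library's difflib. The function names are hypothetical, chosen only for illustration.

```python
import difflib
import re


def normalized_paragraphs(text):
    """Split text on blank lines and collapse whitespace inside each paragraph.

    Assumption: a blank line is treated as a paragraph break. This is a crude
    approximation, not a real reStructuredText parse.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [" ".join(p.split()) for p in paragraphs if p.strip()]


def paragraph_diff(old_text, new_text, old_name="old", new_name="new"):
    """Return unified-diff lines comparing the two texts paragraph by paragraph."""
    return list(
        difflib.unified_diff(
            normalized_paragraphs(old_text),
            normalized_paragraphs(new_text),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Re-wrapping a paragraph produces no diff; splitting it in two does.
    print(paragraph_diff("a long line\nthat was wrapped", "a long\nline that was wrapped"))
    for line in paragraph_diff("Hello!  World!", "Hello!\n\nWorld!"):
        print(line)
```

With this approach, re-wrapping a paragraph yields an empty diff, while turning one paragraph into two still shows up as a change. Note that difflib emits unified-format output here, which is not necessarily what a graphical front end expects.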


