how to find difference in number of characters
Diez B. Roggisch
deets at web.de
Sat Oct 9 14:39:11 EDT 2010
harryos <oswald.harry at gmail.com> writes:
> On Oct 9, 4:52 pm, Peter Otten <__pete... at web.de> wrote:
>
>>
>> You might get more/better answers if you tell us more about the context of
>> the problem and add some details that may be relevant.
>>
>> Peter
>
> I am trying to determine whether a web page has been updated by x number of
> characters. The Mozilla Firefox plugin 'Update Scanner' has similar
> functionality. A user can specify the x. I think this would be done
> by reading from the same URL at two different times and finding the
> change in body text. I was wondering if difflib could offer something
> in the way of determining the size of the delta.
If you normalize the data, this might be worth trying.
Make each tag appear on a single line of its own, possibly re-ordering
attributes so that they are in alphabetical order. Each text child also gets
normalized, by replacing every run of whitespace with a single space.
Then run difflib over the results, and count the number of differences.
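
Something along these lines might do as a starting point. It is only a rough
sketch, assuming Python 3 and the standard html.parser module; the names
Normalizer, normalize and count_differences are made up for illustration.
Note that the character count is coarse: it adds up the full length of every
line difflib reports as added or removed, not the minimal character delta.

```python
import difflib
from html.parser import HTMLParser


class Normalizer(HTMLParser):
    """Collect one canonical line per tag or text node (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes by name so their order never shows up as a change.
        parts = [tag]
        for name, value in sorted(attrs, key=lambda kv: kv[0]):
            # Valueless attributes (e.g. "disabled") arrive with value None.
            parts.append(name if value is None else '%s="%s"' % (name, value))
        self.lines.append("<%s>" % " ".join(parts))

    def handle_endtag(self, tag):
        self.lines.append("</%s>" % tag)

    def handle_data(self, data):
        # Collapse every run of whitespace into a single space.
        text = " ".join(data.split())
        if text:
            self.lines.append(text)


def normalize(html):
    """Return the document as a list of normalized lines."""
    parser = Normalizer()
    parser.feed(html)
    parser.close()
    return parser.lines


def count_differences(old_html, new_html):
    """Return (changed_lines, changed_chars) between two HTML strings."""
    changed_lines = 0
    changed_chars = 0
    for line in difflib.ndiff(normalize(old_html), normalize(new_html)):
        # ndiff marks removed lines with "- " and added lines with "+ ".
        if line.startswith(("- ", "+ ")):
            changed_lines += 1
            changed_chars += len(line) - 2
    return changed_lines, changed_chars
```

Fetching the same URL at two different times and passing both bodies to
count_differences would then give a number to compare against the user's x.
Attribute order and whitespace changes alone should not count as differences.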
Diez