[New-bugs-announce] [issue41964] difflib SequenceMatcher get_matching_blocks returns non-matching blocks in some cases

Snidhi Sofpro report at bugs.python.org
Wed Oct 7 03:34:46 EDT 2020


New submission from Snidhi Sofpro <Snidhi.Sofpro at gmail.com>:

---------- Demo case with unexpected results starting from the third matching block (output of the code that follows):

sys.version_info(major=3, minor=6, micro=9, releaselevel='final', serial=0) 

Matches between:
<a id="nhix_Rgstr" href="http://local:56067/register/200930162135700">
<a id="nhix_Rgstr" href="http://local:53813/register/20100517282450281">

Match(a=0, b=0, size=39)
same-> <a id="nhix_Rgstr" href="http://local:5
same=> <a id="nhix_Rgstr" href="http://local:5 

Match(a=43, b=43, size=12)
same-> /register/20
same=> /register/20 

Match(a=59, b=55, size=1)
same-> 1
same=> 0 

Match(a=66, b=56, size=2)
same-> 00
same=> 93 

Match(a=68, b=70, size=2)
same-> ">
same=>  


# ---------- code that results in the above:

def get_mblk(dpiy_Frst, dpiy_Scnd):
    import difflib
    sqmn_o = difflib.SequenceMatcher(None, dpiy_Frst, dpiy_Scnd)
    mblk_ls = list(sqmn_o.get_matching_blocks())
    for mblk in mblk_ls[:-1]:  # exclude the last dummy block
        print(mblk)
        mtch_a = dpiy_Frst[mblk.a : mblk.a + mblk.size]
        mtch_b = dpiy_Frst[mblk.b : mblk.b + mblk.size]
        print('same->', mtch_a)
        print('same=>', mtch_b, '\n')

# --- main --

s1='<a id="nhix_Rgstr" href="http://local:56067/register/200930162135700">'
s2='<a id="nhix_Rgstr" href="http://local:53813/register/20100517282450281">'

import sys; print(sys.version_info, '\n');
print("Matches between:"); print(s1); print(s2); print('\n');
get_mblk(s1, s2);
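
One thing that may be worth checking before treating this as a difflib problem: in get_mblk, both mtch_a and mtch_b are sliced from dpiy_Frst, while Match.b is documented as an offset into the second sequence. The sketch below assumes that this is the source of the mismatched output and slices the second string for the b side. The helper name matching_slices is hypothetical and only used for illustration; s1 and s2 are the strings from the demo above.

```python
import difflib


def matching_slices(first, second):
    """Return (match, slice_of_first, slice_of_second) for each real matching block."""
    matcher = difflib.SequenceMatcher(None, first, second)
    rows = []
    # The final block is a zero-size sentinel, so it is skipped.
    for m in matcher.get_matching_blocks()[:-1]:
        part_a = first[m.a:m.a + m.size]    # m.a is an offset into the first sequence
        part_b = second[m.b:m.b + m.size]   # m.b is an offset into the second sequence
        rows.append((m, part_a, part_b))
    return rows


s1 = '<a id="nhix_Rgstr" href="http://local:56067/register/200930162135700">'
s2 = '<a id="nhix_Rgstr" href="http://local:53813/register/20100517282450281">'

for m, part_a, part_b in matching_slices(s1, s2):
    print(m)
    print('same->', part_a)
    print('same=>', part_b, '\n')
```

If the two printed slices agree for every block when run this way, the blocks reported by get_matching_blocks are consistent and the discrepancy comes from the slicing in the demo helper rather than from SequenceMatcher.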

----------
messages: 378149
nosy: Snidhi
priority: normal
severity: normal
status: open
title: difflib SequenceMatcher get_matching_blocks returns non-matching blocks in some cases
versions: Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41964>
_______________________________________

