pandas finding field difference between two dataframes

zljubisic at gmail.com zljubisic at gmail.com
Mon Oct 30 04:25:14 EDT 2017


Hi,

I have to compare two pandas dataframes and find difference between each row.
For example, in the code bellow, rows with index 0 and 3 are intentionally different.
Row 0 is different in field A, and row 3 is different in field 3.

After merging dataframes, I can concentrate to the dfm with dfm['_merge'] != 'both'.
How to find which field is different?
End result should be:
row 0, A
row 3, B

Regards.


import pandas as pd
import numpy as np


idx = pd.date_range('01.01.2017', periods=7, freq='D')
A = [False, True, True, False, True, False, True]
B = np.random.randn(7)
C = np.random.randn(7)

data1 = { 'A' : [False, True, True, False, True, False, True],
        'B' : list(range(len(idx))),
        'C' : list(range(len(idx), len(idx) + len(idx)))
        }

dfl = pd.DataFrame(data1, index=idx)

data2 = data1.copy()
data2['A'][0] = True
data2['B'][3] = 30

dfr = pd.DataFrame(data2, index=idx)


# dfm = dfl.merge(dfr, indicator=True, how='outer', left_on=dfl.index, right_on = dfr.index)
dfm = dfl.merge(dfr, indicator=True, how='outer')

print(dfl)
print(dfr)
print(dfm)

print(dfl.iloc[[0,3]])
print(dfr.iloc[[0,3]])

Output:

                A  B   C
2017-01-01  False  0   7
2017-01-02   True  1   8
2017-01-03   True  2   9
2017-01-04  False  3  10
2017-01-05   True  4  11
2017-01-06  False  5  12
2017-01-07   True  6  13
                A   B   C
2017-01-01   True   0   7
2017-01-02   True   1   8
2017-01-03   True   2   9
2017-01-04  False  30  10
2017-01-05   True   4  11
2017-01-06  False   5  12
2017-01-07   True   6  13
       A   B   C      _merge
0  False   0   7   left_only
1   True   1   8        both
2   True   2   9        both
3  False   3  10   left_only
4   True   4  11        both
5  False   5  12        both
6   True   6  13        both
7   True   0   7  right_only
8  False  30  10  right_only
Backend TkAgg is interactive backend. Turning interactive mode on.
                A  B   C
2017-01-01  False  0   7
2017-01-04  False  3  10
                A   B   C
2017-01-01   True   0   7
2017-01-04  False  30  10



More information about the Python-list mailing list