Compare files excel

Peter Otten __peter__ at web.de
Sun Jul 23 07:21:23 EDT 2017


Smith wrote:

> On 22/07/2017 22:21, Albert-Jan Roskam wrote:
>> df1['difference'] = (df1 == df2).all(axis=1)
> 
> The error is shown below:
> 
> In [17]: diff = df1['difference'] = (df1 == df2).all(axis=1)
> 
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> <ipython-input-17-195a2c4caf00> in <module>()
> ----> 1 diff = df1['difference'] = (df1 == df2).all(axis=1)
> 
> /usr/local/lib/python3.5/dist-packages/pandas/core/ops.py in f(self, other)
>     1295     def f(self, other):
>     1296         if isinstance(other, pd.DataFrame):  # Another DataFrame
> -> 1297             return self._compare_frame(other, func, str_rep)
>     1298         elif isinstance(other, ABCSeries):
>     1299             return self._combine_series_infer(other, func)
> 
> /usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in _compare_frame(self, other, func, str_rep)
>     3570     def _compare_frame(self, other, func, str_rep):
>     3571         if not self._indexed_same(other):
> -> 3572             raise ValueError('Can only compare identically-labeled '
>     3573                              'DataFrame objects')
>     3574         return self._compare_frame_evaluate(other, func, str_rep)
> 
> ValueError: Can only compare identically-labeled DataFrame objects

The two dataframes must carry identical labels, i.e. the same column names 
and the same row index. Compare:

>>> import pandas as pd
>>> a = pd.DataFrame([[1,2],[3,4]], columns=["a", "b"])
>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "c"])

With different column names:

>>> a != b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/core/ops.py", line 875, in f
    return self._compare_frame(other, func, str_rep)
  File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2860, in _compare_frame
    raise ValueError('Can only compare identically-labeled '
ValueError: Can only compare identically-labeled DataFrame objects

Again, with identical column names:

>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "b"])
>>> a != b
       a      b
0  False  False
1  False   True
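
If it is not obvious which labels differ, something along these lines may 
help. This is only a sketch: label_mismatch() is a hypothetical helper I made 
up for illustration, not part of pandas, and restricting the comparison to 
the shared columns is an assumption about what you want.

```python
import pandas as pd

a = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
b = pd.DataFrame([[1, 2], [3, 5]], columns=["a", "c"])


def label_mismatch(left, right):
    """Return the column and index labels found in only one of the frames.

    Hypothetical helper for illustration; not a pandas function.
    """
    return {
        "columns": sorted(left.columns.symmetric_difference(right.columns)),
        "index": sorted(left.index.symmetric_difference(right.index)),
    }


# Shows which labels prevent a direct a == b comparison.
mismatch = label_mismatch(a, b)
print(mismatch)

# Compare only the columns that both frames have in common.
shared = a.columns.intersection(b.columns)
row_equal = (a[shared] == b[shared]).all(axis=1)
print(row_equal)
```

Keeping the result in a separate variable like row_equal, rather than 
assigning it to a new column, leaves the labels of both dataframes untouched 
so the comparison can be repeated.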




