Problem with concatenating two dataframes

MRAB python at mrabarnett.plus.com
Sat Nov 6 15:50:56 EDT 2021


On 2021-11-06 16:16, Mahmood Naderan via Python-list wrote:
> In the following code, I am trying to create some key-value pairs in a dictionary where the first element is a name and the second element is a dataframe.
> 
> # Creating a dictionary
> data = {'Value':[0,0,0]}
> kernel_df = pd.DataFrame(data, index=['M1','M2','M3'])
> dict = {'dummy':kernel_df}
> # dummy  ->          Value
> #               M1      0
> #               M2      0
> #               M3      0
> 
> 
> Then I read a file and create some batches and compare the name in the batch with the stored names in dictionary. If it doesn't exist, a new key-value (name and dataframe) is created. Otherwise, the Value column is appended to the existing dataframe.
> 
> 
> df = pd.read_csv('test.batch.csv')
> print(df)
> for i in range(0, len(df), 3):
>      print("\n------BATCH BEGIN")
>      batch_df = df.iloc[i:i+3]
>      name = batch_df.loc[i].at["Name"]
>      values = batch_df.loc[:,["Value"]]
>      print(name)
>      print(values)
>      print("------BATCH END")
>      if name in dict:
>          # Append values to the existing key
>          dict[name] = pd.concat( dict[name],values )   #### ERROR
>      else:
>          # Create a new pair in dictionary
>          dict[name] = values;
> 
> 
> 
> As you can see in the output, the concat statement raises an error.
> 
> 
> 
>      ID Name Metric  Value
> 0   0   K1     M1     10
> 1   0   K1     M2      5
> 2   0   K1     M3     10
> 3   1   K2     M1     20
> 4   1   K2     M2     10
> 5   1   K2     M3     15
> 6   2   K1     M1      2
> 7   2   K1     M2      2
> 8   2   K1     M3      2
> 
> ------BATCH BEGIN
> K1
>     Value
> 0     10
> 1      5
> 2     10
> ------BATCH END
> 
> ------BATCH BEGIN
> K2
>     Value
> 3     20
> 4     10
> 5     15
> ------BATCH END
> 
> ------BATCH BEGIN
> K1
>     Value
> 6      2
> 7      2
> 8      2
> ------BATCH END
> 
> 
> 
> 
> When it reaches the concat() statement, I get this error:
> 
> TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
> 
> 
> Based on the definition I wrote at the beginning of the code, "dict[name]" should be a dataframe. Isn't it?
> 
> How can I fix that?
> 
You're trying to concatenate by passing the 2 items as the first 2 
arguments to pd.concat, but, as the error message says, it expects them 
as a single _iterable_, e.g. a list, passed as the first argument.

Try this instead:

     dict[name] = pd.concat([dict[name], values])
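
For reference, here is a minimal self-contained sketch of the same fix. It rebuilds the K1 batches from the sample output by hand instead of reading the CSV. The name "frames" is only an illustrative stand-in for your "dict" variable, chosen so the example doesn't shadow the built-in dict.

```python
import pandas as pd

# Frame already stored under the key "K1" (rows 0-2 of the sample data).
# "frames" is a hypothetical name standing in for the poster's "dict".
frames = {"K1": pd.DataFrame({"Value": [10, 5, 10]}, index=[0, 1, 2])}

# Next batch for the same key (rows 6-8 of the sample data).
values = pd.DataFrame({"Value": [2, 2, 2]}, index=[6, 7, 8])

# Passing the two frames as separate positional arguments fails.
try:
    pd.concat(frames["K1"], values)
    separate_args_failed = False
except TypeError:
    separate_args_failed = True

# Passing them as one list works and stacks the rows.
frames["K1"] = pd.concat([frames["K1"], values])
print(frames["K1"])
```

By default pd.concat stacks the rows (axis=0) and keeps the original index labels. If you'd rather have the result renumbered from 0, you can pass ignore_index=True.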

