[Tutor] [Pandas] - Creating a bigger dataframe by using one minor dataframe as variable of another minor dataframe

Henrique C. S. Junior henriquecsj at gmail.com
Sun Mar 11 16:02:07 EDT 2018


I’m working with two dataframes obtained from an infrared spectroscopy
analysis (yes, we’re scientists trying to work better using Python).
The dataframes are as follows:

***DATA1:***

        IR    RAMAN   CM-1
     245.54  730.41  3538.10   s1 100
       3.93  204.17  3237.13   s6 93
      11.13  477.42  3233.43   s3 14   s5 76
       3.44  136.83  3229.53   s3 -78   s5 15
       6.40  363.33  3219.42   s7 94
       8.02  296.03  3217.14   s2 -90
       6.13   80.90  3209.69   s9 93
       3.43  166.41  3204.74   s4 -92
       8.91  146.43  3203.92   s8 94
      26.77  201.97  3168.99   s10 82   s12 -18


***DATA2:***

    s 1     1.00   STRE   14   21   NH    1.015055  f3538 100
    s 2     1.00   STRE    1   22   CH    1.081994  f3217 90
    s 3    -1.00   STRE    1   22   CH    1.081994  f3233 14  f3230 78
    s 4     1.00   STRE    4   23   CH    1.082576  f3205 92
    s 5     1.00   STRE   19   26   CH    1.080387  f3237 93
    s 7     1.00   STRE   17   28   CH    1.083210  f3219 94
    s 8     1.00   STRE   17   28   CH    1.083210  f3204 94
    s 57    1.00   BEND   11   12   14   CCN   129.85  f940 20
    s 58    1.00   BEND   19   18   15   CCC   120.65
    s 59   -1.00   BEND   15   17   20   CCC   120.28  f1037 32  f842 10
    s 60    1.00   BEND   11    8   10   CCO   122.22  f402 38
    s 61   -1.00   BEND    8   11   13   CCCl   114.81  f221 60  f194 11d
    s 93    1.00   OUT    30   19   20   16   OCCC     0.17  f864 14  f746 11  f535 17  f517 11
    s 94    1.00   OUT     7    6    1    3   CCCC     0.48  f829 10  f172 41
    s 95    1.00   OUT    17   14   18   15   CNCC     3.64  f564 16  f535 15  f396 10
    s 96    1.00   OUT    10    6   11    8   OCCC     0.74  f822 63
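
Note that DATA2 writes its labels with a space ("s 1") while DATA1 writes
them without one ("s1"), so the two spellings need to agree before the
tables can be related. The following is only a minimal sketch of one way to
split a raw DATA2 line and normalize the label, assuming the layout shown
above. The names `DATA2_LINE` and `parse_data2_line` and the dictionary keys
are made up for illustration, and the trailing "f<freq> <percent>" pairs are
simply ignored here.

```python
import re

# Hypothetical pattern for one DATA2 line: label, coefficient, coordinate
# type, two to four atom indices, a description such as NH or CCC, a value,
# and then any trailing "f<freq> <percent>" pairs (captured but not parsed).
DATA2_LINE = re.compile(
    r"^s\s*(\d+)\s+(-?\d+\.\d+)\s+(STRE|BEND|OUT)\s+((?:\d+\s+)+)"
    r"([A-Za-z]+)\s+(-?\d+\.\d+)(.*)$"
)


def parse_data2_line(line):
    """Split one DATA2 line into a dict; return None if it does not match."""
    match = DATA2_LINE.match(line.strip())
    if match is None:
        return None
    number, coef, kind, atoms, desc, value, _rest = match.groups()
    return {
        "label": "s" + number,             # "s 1" becomes "s1", as in DATA1
        "coef": float(coef),
        "type": kind,
        "atoms": " ".join(atoms.split()),  # e.g. "14 21"
        "desc": desc,
        "value": float(value),
    }
```

A list of such dicts (skipping any `None` results) could then be passed to
`pandas.DataFrame` to build the DATA2 table with a `label` column.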

At this point, both have been converted (after a LOT of work) into pandas
dataframes. As a final result, I’d like to create one big dataframe,
because the two are related by the variables s1, s2, s3, s4 and so on.
That is, I’d like to insert the DATA2 information for s1, s2, s3, ..., sn
every time those variables appear in DATA1:

***DATA1 with DATA2:***
*(I used asterisks to mark every place where DATA2 was inserted into DATA1)*

        IR    RAMAN   CM-1
     245.54  730.41  3538.10   *s1* 100 ***STRE   14   21   NH    1.015055***
       3.93  204.17  3237.13   *s6* 93 ***STRE   19   26   CH    1.080387***
      11.13  477.42  3233.43   *s3* 14 ***STRE    1   22   CH    1.081994***   *s5* 76 ***STRE   19   26   CH    1.080387***
       3.44  136.83  3229.53   *s3* -78 ***STRE    1   22   CH    1.081994***   *s5* 15 ***STRE   19   26   CH    1.080387***

Observe that some lines can have more than two variables.
I have to admit that I’m completely lost using pandas right now and any
help is much appreciated.
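
One possible direction, offered only as a sketch: because a DATA1 line can
carry a varying number of variables, it may be easier to reshape DATA1 into
a "long" table with one row per (line, variable) pair and then merge DATA2
onto it by label, rather than widening each line. The example below assumes
the variable column of DATA1 is still raw text and that DATA2 labels are
already spelled "s1", "s3", and so on. The column names `assign`, `label`,
`pct`, `type`, `atoms`, `desc` and `value` are hypothetical, and only a few
rows from the tables above are used.

```python
import pandas as pd

# A few rows of DATA1; "assign" holds the raw "s<n> <percent>" text.
data1 = pd.DataFrame({
    "IR": [245.54, 11.13, 3.44],
    "RAMAN": [730.41, 477.42, 136.83],
    "CM-1": [3538.10, 3233.43, 3229.53],
    "assign": ["s1 100", "s3 14   s5 76", "s3 -78   s5 15"],
})

# The matching rows of DATA2, keyed by the label as spelled in DATA1.
data2 = pd.DataFrame({
    "label": ["s1", "s3", "s5"],
    "type": ["STRE", "STRE", "STRE"],
    "atoms": ["14 21", "1 22", "19 26"],
    "desc": ["NH", "CH", "CH"],
    "value": [1.015055, 1.081994, 1.080387],
})

# Pull every (label, percent) pair out of the text column. extractall gives
# one row per match, indexed by (original row, match number).
pairs = data1["assign"].str.extractall(r"(?P<label>s\d+)\s+(?P<pct>-?\d+)")
pairs["pct"] = pairs["pct"].astype(int)
pairs = pairs.reset_index(level="match", drop=True)

# One row per (DATA1 line, label) pair, then attach the DATA2 columns.
long_form = data1.drop(columns="assign").join(pairs)
merged = long_form.merge(data2, on="label", how="left")
print(merged)
```

With `how="left"`, a label that has no row in DATA2 would still be kept,
with missing values in the DATA2 columns, which makes unmatched labels easy
to spot.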

Thanks in advance.

-- 
*Henrique C. S. Junior*

