Home > Mobile >  Conditional merge and transformation of data in pandas
Conditional merge and transformation of data in pandas

Time:01-06

I have two data frames, and I want to create new columns in frame 1 using properties from frame 2

frame 1

    Name
    alice
    bob
    carol

frame 2

    Name  Type  Value
    alice lower 1
    alice upper 2
    bob   equal 42
    carol lower 0

desired result

frame 1

    Name   Lower Upper
    alice      1     2
    bob       42    42
    carol      0    NA
    

Hence, the common column of both frames is Name. You can use Name to look up bounds (of a variable), which are specified in the second frame. Frame 1 lists each name exactly once. Frame 2 might have one or two entries per frame, which might either specify a lower or an upper bound (or both at a time if the type is equal). We do not need to have both bounds for each variable, one of the bounds can stay empty. I would like to have a frame that lists the range of each variable. I see how I can do that with for-loops over the columns, but that does not seem to be in the pandas spirit. Do you have any suggestions for a compact solution? :-) Thanks in advance

CodePudding user response:

You're not looking for a merge, but rather a pivot.

(df2[df2['Name'].isin(df1['Name'])]
 .pivot('Name', 'Type', 'Value')
 .reset_index()
)

But this doesn't handle the special 'equal' case.

For this, you can use a little trick. Replace 'equal' by a list with the other two values and explode to create the two rows.

(df2[df2['Name'].isin(df1['Name'])]
 .assign(Type=lambda d: d['Type'].map(lambda x: {'equal': ['lower', 'upper']}.get(x,x)))
 .explode('Type')
 .pivot('Name', 'Type', 'Value')
 .reset_index()
 .convert_dtypes()
)

Output:

    Name  lower  upper
0  alice      1      2
1    bob     42     42
2  carol      0   <NA>
  •  Tags:  
  • Related