How can I look up values from a dataframe/matrix for a list of tuples

Time:01-12

I have a matrix that stores values like the table below (row labels are on the left):

             play_tv  play_series  Null  purchase  Conversion
Start           0.02         0.03  0.04      0.05        0.06
play_series     0.07         0.08  0.09      0.10        0.11
play_tv         0.12         0.13  0.14      0.15        0.16
Null            0.17         0.18  0.19      0.20        0.21
purchase        0.22         0.23  0.24      0.25        0.26
Conversion      0.27         0.28  0.29      0.30        0.31

And I have a dataframe like this:

session_id  path                                    path_pair
T01         [Start, play_series, Null]              [(Start, play_series), (play_series, Null)]
T02         [Start, play_tv, purchase, Conversion]  [(Start, play_tv), (play_tv, purchase), (purchase, Conversion)]

I want to look up each pair in the matrix and either replace the path_pair column or create a new column in my current dataframe. The result should be a list of values. How can I do that?

[(Start, play_series), (play_series, Null)] -> [0.03, 0.09]

[(Start, play_tv), (play_tv, purchase), (purchase, Conversion)] -> [0.02, 0.15, 0.26]

The result I want:

session_id  path                                    path_pair
T01         [Start, play_series, Null]              [0.03, 0.09]
T02         [Start, play_tv, purchase, Conversion]  [0.02, 0.15, 0.26]

The script I tried for getting a single value from the matrix:

trans_matrix[trans_matrix.index=="Start"]["play_series"].values[0]

CodePudding user response:

Given your input:

import pandas as pd

# Transition matrix: the index holds the first item of each pair, the columns the second.
df1 = pd.DataFrame({'play_tv': [0.02, 0.07, 0.12, 0.17, 0.22, 0.27],
                   'play_series': [0.03, 0.08, 0.13, 0.18, 0.23, 0.28],
                   'Null': [0.04, 0.09, 0.14, 0.19, 0.24, 0.29],
                   'purchase': [0.05, 0.1, 0.15, 0.2, 0.25, 0.3],
                   'Conversion': [0.06, 0.11, 0.16, 0.21, 0.26, 0.31]},
                  index=['Start','play_series','play_tv','Null','purchase','Conversion'])
df2 = pd.DataFrame({'session_id': ['T01', 'T02'],
                    'path': [['Start', 'play_series', 'Null'],
                             ['Start', 'play_tv', 'purchase', 'Conversion']],
                    'path_pair': [[('Start', 'play_series'),( 'play_series', 'Null')],
                                  [('Start', 'play_tv'),('play_tv', 'purchase'),('purchase', 'Conversion')]]})

You can update df2 by applying a function to column 'path_pair' that looks up each pair in df1, using the first item as the row label and the second as the column label:

df2['path_pair'] = df2['path_pair'].apply(lambda lst: [df1.loc[x,y] for (x,y) in lst])

Output:

  session_id                                    path           path_pair
0        T01              [Start, play_series, Null]        [0.03, 0.09]
1        T02  [Start, play_tv, purchase, Conversion]  [0.02, 0.15, 0.26]
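
If you would rather keep 'path_pair' and write the values to a new column, as the question also allows, something like the sketch below may work. It assumes every pair is just two consecutive steps of 'path', so it builds the pairs with zip instead of reading 'path_pair'. The column name 'pair_values' and the helper path_values are made up for this example, and df1.at is used here only as a scalar-lookup alternative to df1.loc.

```python
import pandas as pd

# Transition matrix from the question: row label = first step, column label = next step.
df1 = pd.DataFrame(
    {'play_tv': [0.02, 0.07, 0.12, 0.17, 0.22, 0.27],
     'play_series': [0.03, 0.08, 0.13, 0.18, 0.23, 0.28],
     'Null': [0.04, 0.09, 0.14, 0.19, 0.24, 0.29],
     'purchase': [0.05, 0.1, 0.15, 0.2, 0.25, 0.3],
     'Conversion': [0.06, 0.11, 0.16, 0.21, 0.26, 0.31]},
    index=['Start', 'play_series', 'play_tv', 'Null', 'purchase', 'Conversion'])

df2 = pd.DataFrame({'session_id': ['T01', 'T02'],
                    'path': [['Start', 'play_series', 'Null'],
                             ['Start', 'play_tv', 'purchase', 'Conversion']]})

def path_values(path):
    # Hypothetical helper: pair each step with the one after it,
    # then read the matching cell from the matrix.
    return [float(df1.at[src, dst]) for src, dst in zip(path, path[1:])]

# 'pair_values' is an assumed column name; the existing columns are left untouched.
df2['pair_values'] = df2['path'].apply(path_values)
print(df2)
```

Note that a step that is missing from df1's index or columns raises a KeyError here, so you may want to validate the paths first if your real data can contain unknown states.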