Home > Back-end >  Utilizing nested dictionary to replace specific values in a Pandas Dataframe
Utilizing nested dictionary to replace specific values in a Pandas Dataframe

Time:01-21

I have a pandas dataframe where I need to replace all the "yes" values with the corresponding number from a nested dictionary. The nested dictionary is composed of the row values of column 'Store' as the outer keys. The inner keys are the columns named 'A' and 'B'.

This is the dataframe:

import pandas as pd

data = [['abc', 'jan','yes','no'], ['abc', 'feb','no','yes'], ['def', 'jan', 'yes','yes'],  ['def', 'feb', 'no','yes']]

df = pd.DataFrame(data, columns = ['Store', 'Month', 'A','B' ])

df
 
  Store Month    A    B
0   abc   jan  yes   no
1   abc   feb   no  yes
2   def   jan  yes  yes
3   def   feb   no  yes

This is the nested dictionary:

# dict = {row value in 'Store' column:{column:point value}}
dict = {'abc':{'A':5,'B':4},'def':{'A':3,'B':2}}

This is the desired output:

  Store Month    A    B
0   abc   jan    5   no
1   abc   feb   no    4
2   def   jan    3    2
3   def   feb   no    2

CodePudding user response:

Try this:

def find_num(store, col_value, col_name):
    if col_value == "yes":
        sub_dict = dict[store]
        return sub_dict[col_name]
    else:
        return "no"


for col in list(df.columns):
    if col == "Store" or col == "Month":
        continue
    df[col] = df.apply(lambda x: find_num(x['Store'], x[col], col), axis=1)

print(df)

CodePudding user response:

Replace 'yes' by np.nan and fillna by the values of your nested dict:

d = {'abc':{'A':5,'B':4},'def':{'A':3,'B':2}}
out = df.replace({'yes': np.nan}).groupby('Store') \
        .apply(lambda x: x.fillna(d[x.name])).droplevel(0)
print(out)

# Output
  Store Month   A   B
0   abc   jan   5  no
1   abc   feb  no   4
2   def   jan   3   2
3   def   feb  no   2
  •  Tags:  
  • Related