I have a list and a dict, as shown below:
col_indices = [df.columns.tolist().index(col) for col in cat_cols]
print(col_indices) #returns [1,5]
t = {'thisdict':{
"Ford":"brand",
"Mustang":"model",
1964:"year"
},
'thatdict':{
"jfsak":"af",
"jhas":"asjf"}}
Basically, I would like to replace the top-level dict keys with their corresponding column indices.
For example, column index 1 belongs to thisdict and column index 5 belongs to thatdict.
I tried something like the code below, but it doesn't work.
key_map_dict = {'1':'thisdict','5':'thatdict'}
d = {(key_map_dict[k] if k in key_map_dict else k):v for (k,v) in t.items() }
Instead of manually defining key_map_dict, is there any way to find the matching column names, get their index positions, and do the replacement in the dict automatically? Defining the mapping by hand is not feasible for a big data frame with a million rows and 200 columns.
I expect my output to look like this:
{1:{
"Ford":"brand",
"Mustang":"model",
1964:"year"
},
5:{
"jfsak":"af",
"jhas":"asjf"}}
CodePudding user response:
You can use zip and dict comprehension:
col_indices = [1, 5]
t = {'thisdict': {"Ford": "brand", "Mustang": "model", 1964: "year"},
'thatdict': {"jfsak": "af", "jhas": "asjf"}}
output = {i: v for i, v in zip(col_indices, t.values())}
print(output)
# {1: {'Ford': 'brand', 'Mustang': 'model', 1964: 'year'}, 5: {'jfsak': 'af', 'jhas': 'asjf'}}
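Note that zip pairs items by position, so the approach above only works when col_indices is in the same order as the keys of t. A minimal sketch of a name-based variant follows; the `columns` list and the contents of `cat_cols` are hypothetical stand-ins, since the question does not show the actual DataFrame:

```python
# Hypothetical stand-in for df.columns.tolist(); the real columns are not shown.
columns = ["id", "thisdict", "a", "b", "c", "thatdict"]
# Assumed contents of cat_cols, deliberately in a different order than t's keys.
cat_cols = ["thatdict", "thisdict"]

t = {'thisdict': {"Ford": "brand", "Mustang": "model", 1964: "year"},
     'thatdict': {"jfsak": "af", "jhas": "asjf"}}

# Same computation as in the question, on the stand-in list.
col_indices = [columns.index(col) for col in cat_cols]

# Pair each column name with its index, then look keys up by name, not position.
name_to_index = dict(zip(cat_cols, col_indices))
output = {name_to_index[name]: inner for name, inner in t.items()}
print(output)
```

Because the lookup goes through the column name, the result does not depend on the order in which cat_cols or t happen to be defined.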
CodePudding user response:
Another option:
df_list = df.columns.tolist()
t = {df_list.index(k): v for k, v in t.items()}
By the way, if you want to combine this with your previous question, you can try the following, which also swaps the keys and values of each inner dict:
df_list = df.columns.tolist()
b = {df_list.index(tk): {v: k for k, v in tv.items()} for tk, tv in t.items()}
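To see what the two comprehensions above produce, here is a small self-contained sketch. The `df_list` contents are an assumption chosen so that the columns sit at positions 1 and 5 as in the question; in real code it would come from `df.columns.tolist()`:

```python
# Hypothetical column list standing in for df.columns.tolist().
df_list = ["id", "thisdict", "a", "b", "c", "thatdict"]

t = {'thisdict': {"Ford": "brand", "Mustang": "model", 1964: "year"},
     'thatdict': {"jfsak": "af", "jhas": "asjf"}}

# Replace each top-level key with its column position.
renamed = {df_list.index(k): v for k, v in t.items()}

# Same replacement, and additionally swap keys and values in each inner dict.
inverted = {df_list.index(tk): {v: k for k, v in tv.items()} for tk, tv in t.items()}

print(renamed)
# {1: {'Ford': 'brand', 'Mustang': 'model', 1964: 'year'}, 5: {'jfsak': 'af', 'jhas': 'asjf'}}
print(inverted)
# {1: {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}, 5: {'af': 'jfsak', 'asjf': 'jhas'}}
```

Keep in mind that `list.index` raises a ValueError if a key of t is not a column name, and that swapping keys and values silently drops entries when two keys share the same value.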
CodePudding user response:
To replace the keys of your dictionary t with column indices, look up each key's position among the DataFrame's columns with get_loc and use that position as the new key:
import pandas
# Provided t
t = {'thisdict': {
"Ford": "brand",
"Mustang": "model",
1964: "year"
},
'thatdict': {
"jfsak": "af",
"jhas": "asjf"}
}
# Assumed df looks something like this
data = {'thisdict': ['abc'],
'thatdict': ['def']}
df = pandas.DataFrame(data)
# Map each key of t to the position of the column with the same name
output = {df.columns.get_loc(name): inner for name, inner in t.items()}
print(output)
Output:
{0: {'Ford': 'brand', 'Mustang': 'model', 1964: 'year'}, 1: {'jfsak': 'af', 'jhas': 'asjf'}}
Note: This relies on all the keys in t existing in your DataFrame, but it would be relatively trivial to add checks if t is not one-to-one with the DataFrame.
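One possible way to add such a check is sketched below. The extra key `otherdict` is hypothetical and only there to show a key with no matching column; the sketch also assumes column names are unique, since `get_loc` returns a slice or mask rather than an integer for duplicated labels:

```python
import pandas as pd

# Assumed DataFrame, as in the answer above.
df = pd.DataFrame({'thisdict': ['abc'], 'thatdict': ['def']})

t = {'thisdict': {"Ford": "brand", "Mustang": "model", 1964: "year"},
     'thatdict': {"jfsak": "af", "jhas": "asjf"},
     'otherdict': {"x": "y"}}  # hypothetical key with no matching column

output = {}
missing = []
for name, inner in t.items():
    if name in df.columns:
        # Column exists: use its integer position as the new key.
        output[df.columns.get_loc(name)] = inner
    else:
        # No such column: record the key instead of raising KeyError.
        missing.append(name)

print(output)
print(missing)
```

Whether unmatched keys should be skipped, kept under their original name, or treated as an error depends on the use case; collecting them in a list simply makes the mismatch visible.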
