I am looking for a way to simplify my code so that I can loop column by column mapping within a pandas dataframe. I have a dataframe (df_lib) containing 16 columns of letters and I map these letters to their corresponding numerical volumes.
volumes = {"G": 0, "A": 26.3, "S": 30.4, "T":56.2, "C":39.7, "F": 129.7, "I": 101.1, "L": 100.8,
"P": 59.3, "R": 129, "S": 30.4, "T": 56.2, "V": 75.3}
df_lib [16] = df_lib [0].map(volumes)
df_lib [17] = df_lib [1].map(volumes)
df_lib [18] = df_lib [2].map(volumes)
df_lib [19] = df_lib [3].map(volumes)
df_lib [20] = df_lib [4].map(volumes)
df_lib [21] = df_lib [5].map(volumes)
df_lib [22] = df_lib [6].map(volumes)
df_lib [23] = df_lib [7].map(volumes)
df_lib [24] = df_lib [8].map(volumes)
df_lib [25] = df_lib [9].map(volumes)
df_lib [26] = df_lib [10].map(volumes)
df_lib [27] = df_lib [11].map(volumes)
df_lib [28] = df_lib [12].map(volumes)
df_lib [29] = df_lib [13].map(volumes)
df_lib [30] = df_lib [14].map(volumes)
df_lib [31] = df_lib [15].map(volumes)
This code works, and gives me a 32 column dataframe (16 columns of letters and 16 columns of numbers). However, I was wondering if there is a more concise way to loop the column mapping so it is easier to make adjustments to future programs which may have different size dataframes or dataframes with different names? Thank you
CodePudding user response:
You can just use a for loop like this :
volumes = {"G": 0, "A": 26.3, "S": 30.4, "T":56.2, "C":39.7, "F": 129.7, "I": 101.1, "L": 100.8,
"P": 59.3, "R": 129, "S": 30.4, "T": 56.2, "V": 75.3}
for i in range(16):
df_lib[i 16] = df_lib[i].map(volumes)
Note that the range function starts with i=0 and stop with i=15.
CodePudding user response:
You could use df.applymap to perform the lookups, and then df.merge to combine the new data with the original.
Small example to demonstrate:
volumes = {'A': 0, 'B': 1, 'C': 2}
df = pd.DataFrame([['A', 'A'], ['A', 'C'], ['B', 'A']])
mapped = df.applymap(volumes.get)
combined = df.merge(mapped, left_index=True, right_index=True)
Which gives:
0_x 1_x 0_y 1_y
0 A A 0 0
1 A C 0 2
2 B A 1 0
If you want numerical column names like in your example, you can then do:
combined.columns = range(len(combined.columns))
0 1 2 3
0 A A 0 0
1 A C 0 2
2 B A 1 0
