Home > Blockchain >  Assigning a Series to a pandas Dataframe depending on multiple different column entries
Assigning a Series to a pandas Dataframe depending on multiple different column entries

Time:04-22

I have both a series as well as a DataFrame:

Series:
1    128
2    112
3    110

Where the Series index indicates a Month

DataFrame

   month     c1       c2
0  1         a       0
1  2         a       0
2  3         a       0
3  1         b       0
4  2         b       0
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

I want to apply the values of the series, so that the months match, but only where c1 has a specific value.

I tried using

df = ...
series = ... 

df.loc[df['c1'] == 'a', 'c2'] = series

But this won't work, as it doesnt use month as index. How can I make it regard the month in DataFrame?

Expected Result (c1 = a):

   month     c1       c2
0  1         a       128
1  2         a       112
2  3         a       110
3  1         b       0
4  2         b       0
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

Additionally, if i use the same series to apply for c1==b, the expected result is this (as there is no month 3, but only 5)

Expected Result (c1 = b):

   month     c1       c2
0  1         a       0
1  2         a       0
2  3         a       0
3  1         b       128
4  2         b       112
5  5         b       0
6  1         c       0
7  2         c       0
8  1         d       0
9  2         d       0

CodePudding user response:

For better performance filter in both sides with mapping by Series.map column month, if no match replace missing values by Series.fillna with downcast='int' parameter:

m = df['c1'] == 'a'
df.loc[m, 'c2'] = df.loc[m, 'month'].map(series).fillna(0, downcast='int')
print (df)
   month c1   c2
0      1  a  128
1      2  a  112
2      3  a  110
3      1  b    0
4      2  b    0
5      5  b    0
6      1  c    0
7      2  c    0
8      1  d    0
9      2  d    0


m = df['c1'] == 'b'
df.loc[m, 'c2'] = df.loc[m, 'month'].map(series).fillna(0, downcast='int')
print (df)
   month c1   c2
0      1  a    0
1      2  a    0
2      3  a    0
3      1  b  128
4      2  b  112
5      5  b    0
6      1  c    0
7      2  c    0
8      1  d    0
9      2  d    0

CodePudding user response:

IIUC, use map to convert your values from the series, and fillna to handle the potentially missing values, then assign and it will align automatically from the index:

df.loc[df['c1'] == 'b', 'c2'] = df['month'].map(series).fillna(0)

or, slice on both sides:

mask = df['c1'] == 'b'
df.loc[mask, 'c2'] = df.loc[mask, 'month'].map(series).fillna(0)

output:

   month c1   c2
0      1  a    0
1      2  a    0
2      3  a    0
3      1  b  128
4      2  b  112
5      5  b    0
6      1  c    0
7      2  c    0
8      1  d    0
9      2  d    0

CodePudding user response:

Another way to do this would be to turn your series into a df, and merge it onto the original df then fill the nas with 0s as below (and change the 'a' after 'c1': in join_df to change what letter you want to join on):

join_df = pd.DataFrame({'month':series.index,
                        'c1':['a']*len(series),
                        'c2':series})
df[['month', 'c1']].merge(join_df, on=['month', 'c1'], how='left').fillna(0)

Output:

   month c1     c2
0      1  a  128.0
1      2  a  112.0
2      3  a  110.0
3      1  b    0.0
4      2  b    0.0
5      5  b    0.0
6      1  c    0.0
7      2  c    0.0
8      1  d    0.0
9      2  d    0.0

For 'b', it outputs:

   month c1     c2
0      1  a    0.0
1      2  a    0.0
2      3  a    0.0
3      1  b  128.0
4      2  b  112.0
5      5  b    0.0
6      1  c    0.0
7      2  c    0.0
8      1  d    0.0
9      2  d    0.0
  • Related