Home > Net >  Fill na values in one pandas dataframe's column using another, but using column indices, not na
Fill na values in one pandas dataframe's column using another, but using column indices, not na

Time:01-12

I have a peculiar situation in which I know that the 0th column may contain nan, and in all these cases, the 1st column contains the missing value. I do not know the names of these columns ahead of time, so I want to select them by index instead.

I am able to select the column like so:

df.iloc[:, [0]]

And normally I am able to fill na values from another column like this, if I knew the names:

df["col0"].fillna(df["col1"])

So I thought that I should be able to replicate this by doing this:

df.iloc[:, [0]].fillna(df[:, [1]])

But I get:

Traceback (most recent call last):
  File "/home/---------/.pycharm_helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<input>", line 1, in <module>
  File "/home/---------/_code/microgrid-support/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/---------/_code/microgrid-support/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 82, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(slice(None, None, None), [1])' is an invalid key

How can I fill the na values using the column index instead of its name?

CodePudding user response:

You could use bfill with iloc

import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[np.nan,2,3],'b':[100,200,300],'c':['x','y','z']})

df.iloc[:,:2] = df.iloc[:,:2].bfill(axis=1)

print(df)

Output

     a      b  c
0  100.0  100.0  x
1    2.0  200.0  y
2    3.0  300.0  z

CodePudding user response:

Use the iloc accessor. Slice using index range to avaoid running into issues.

Sample

df1 = pd.DataFrame({'sub_name': [np.nan,'AAB','AAC','BAA','CAA','CAC','CAD','CAE','EAA', 'FAA'], 
'val_1': [2,4,8,7,4,6,2,3,8,3], 
'A':[208,208,208,210,213,213,213,213,222,223]})


df1.iloc[0:1,0].fillna(df1.iloc[0,1])




   sub_name  val_1    A
0        2      2  208
1      AAB      4  208
2      AAC      8  208
3      BAA      7  210
4      CAA      4  213
5      CAC      6  213
6      CAD      2  213
7      CAE      3  213
8      EAA      8  222
9      FAA      3  223
  •  Tags:  
  • Related