How to fill NaN values based on the top and bottom strings with highest frequency-CodePudding

I have a dataframe of string values with missing values in it. It needs to be populated/filled by the below conditions.

From the NaN value index , Check the last 3 rows and next 3 rows and replace the NaN with the most frequent/repeated value out of 6 rows.
If there is 2 strings with an equal amount of frequency occurred from the last 3 rows and next 3 rows , replace the NaN with the value that has lowest index out of theses 6 rows.

My DataFrame:

     reading
0       talk
1       kill
2        NaN
3   vertical
4       type
5       kill
6        NaN
7   vertical
8   vertical
9       type
10   durable
11       NaN
12   durable
13  vertical

Expected output:

     reading
0       talk
1       kill
2       kill
3   vertical
4       type
5       kill
6   vertical
7   vertical
8   vertical
9       type
10   durable
11  vertical
12   durable
13  vertical

Here is the minimum reproducible code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'reading':['talk','kill',np.NAN,'vertical','type','kill',np.NAN,'vertical','vertical','type','durable',np.NAN,'durable','vertical']}) 

def filldf(df):
    # Do the logic here
    return df

I am not sure how to approach this problem. Any help will be appreciated !!

CodePudding user response：

If you don't have too many NaN values, you can iterate over the index of NaN "reading" values and simply look for the mode of the surrounding 6 values of it (use iloc to get the first occurrence of multiple modes) and assign the values back to the corresponding "NaN" values

msk = df['reading'].isna()
df.loc[msk, 'reading'] = [df.loc[min(0, i-3):i 3, 'reading'].mode().iloc[0] for i in  df.index[msk]]

Output:

     reading
0       talk
1       kill
2       kill
3   vertical
4       type
5       kill
6   vertical
7   vertical
8   vertical
9       type
10   durable
11  vertical
12   durable
13  vertical