I have the following DataFrame:
| y |
|---|
| NaN |
| NaN |
| 5 |
| NaN |
| 7 |
I would like to write a function that will return the number of NaN values before the first non-NaN value. Given the above example, the function should return the value 2.
I tried to solve my problem using this question, but it did not help me much.
Edit: The values always start with a NaN. If the column is all NaN, the function should return the column length.
CodePudding user response:
You could use isna to mark the NaN values as True (1), then cumprod to zero out everything from the first non-NaN value onward. Summing the result counts only the leading NaNs:
df['y'].isna().cumprod().sum()
output: 2
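To see why this works, here is a small sketch that prints each intermediate step on the data from the question. The variable names `mask`, `leading` and `count` are only for illustration.

```python
import numpy as np
import pandas as pd

# The column from the question
df = pd.DataFrame({"y": [np.nan, np.nan, 5, np.nan, 7]})

mask = df["y"].isna()      # True, True, False, True, False
leading = mask.cumprod()   # stays 1 until the first False, then 0 for the rest
count = int(leading.sum()) # number of leading NaNs

print(mask.tolist())
print(leading.tolist())
print(count)  # 2
```

The NaN at position 3 is not counted because the running product has already dropped to 0 by then.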
CodePudding user response:
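You can use first_valid_index.
df.y.first_valid_index()
> 2
This returns the index label of the first non-NaN value. With the default integer index (0, 1, 2, ...), that label equals the number of leading NaNs, so no sum is needed. Note that it returns None if the column is all NaN, so that case needs separate handling.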
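If the index is not the default one, or the column can be all NaN, something like the following might be safer. It is only a sketch: the function name `leading_nans_via_first_valid` is made up, and it resets the index so that first_valid_index gives a position rather than a label.

```python
import numpy as np
import pandas as pd

def leading_nans_via_first_valid(s: pd.Series) -> int:
    # Drop the original labels so first_valid_index() returns a 0-based position
    first = s.reset_index(drop=True).first_valid_index()
    # first_valid_index() returns None when there is no non-NaN value
    return len(s) if first is None else int(first)

# Same values as in the question, but with a non-default index
s = pd.Series([np.nan, np.nan, 5, np.nan, 7], index=[10, 20, 30, 40, 50])

print(s.first_valid_index())            # 30 (a label, not a count)
print(leading_nans_via_first_valid(s))  # 2
```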
CodePudding user response:
Use Series.isna with Series.cummin, which stays True only until the first non-NaN value, and count the True values with sum:
s = df['y'].isna().cummin().sum()
print(s)
2
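Since the question asks for a function, the cummin approach could be wrapped roughly like this. The name `count_leading_nans` is just a placeholder; the checks cover the example from the question and the all-NaN case from the edit.

```python
import numpy as np
import pandas as pd

def count_leading_nans(s: pd.Series) -> int:
    # cummin keeps True only while every value so far has been NaN
    return int(s.isna().cummin().sum())

df = pd.DataFrame({"y": [np.nan, np.nan, 5, np.nan, 7]})
all_nan = pd.Series([np.nan] * 4)

print(count_leading_nans(df["y"]))  # 2
print(count_leading_nans(all_nan))  # 4, the column length
```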
