I have a dataframe that has some missing values. I want to replace those missing values with a value from another cell in the dataframe based on a condition. So the dataframe looks like this:
| x | a |
|---|---|
| xyz | A |
| lmn | B |
| None | A |
| xyz | A |
| qrs | C |
| None | B |
What I want to do is fill each "None" cell in column x with the x value from another row that has the same value in column a. So that it looks like this:
| x | a |
|---|---|
| xyz | A |
| lmn | B |
| xyz | A |
| xyz | A |
| qrs | C |
| lmn | B |
The index is just sequential integers starting at 0. The data changes from one dataset to the next, so the index of the cells with missing information is not fixed.
CodePudding user response:
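You can use groupby() with ffill() to forward-fill the missing values within each group of column a:
import numpy as np
# Turn the 'None' strings into real NaN, then forward-fill x per group of a
df['x'] = df.replace('None', np.nan).groupby('a')['x'].ffill()
print(df)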
# Output:
     x  a
0 xyz A
1 lmn B
2 xyz A
3 xyz A
4 qrs C
5 lmn B
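One thing to watch: ffill() only copies from rows that come earlier in the same group, so a missing value that appears before any known value for its letter would stay NaN. Here is a minimal sketch of one way around that, assuming each value of a corresponds to a single x. It uses groupby(...).transform('first'), which broadcasts each group's first non-null x to every row of the group. The sample frame is rebuilt from the question, and None is assumed to be a real missing value rather than the string 'None'.

```python
import pandas as pd

# Sample data rebuilt from the question; None is a true missing value here.
df = pd.DataFrame({
    'x': ['xyz', 'lmn', None, 'xyz', 'qrs', None],
    'a': ['A', 'B', 'A', 'A', 'C', 'B'],
})

# First non-null x per group of a, aligned to the original index.
first_x = df.groupby('a')['x'].transform('first')

# Fill only the gaps; existing values are left untouched.
df['x'] = df['x'].fillna(first_x)
print(df)
```

Because the fill value is looked up per group rather than carried forward, the row order no longer matters.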
CodePudding user response:
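for i in range(len(df)):
    if df.loc[i, 'a'] == 'A':
        # .loc writes to df directly; df['x'][i] = ... is chained assignment and may not stick
        df.loc[i, 'x'] = 'xyz'
This worked for me. If you want to do all the other letters, just add an elif for each one.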
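If there are many letters, the chain of elif branches could be replaced by a lookup table. This is only a rough sketch, assuming a hand-written dictionary (called fill_values here, a made-up name) that maps each value of a to the x it should get. Unlike the loop, it only touches rows where x is missing, and it assumes None is a real missing value rather than the string 'None'.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'x': ['xyz', 'lmn', None, 'xyz', 'qrs', None],
    'a': ['A', 'B', 'A', 'A', 'C', 'B'],
})

# Hypothetical lookup table: letter in column a -> value for column x.
fill_values = {'A': 'xyz', 'B': 'lmn', 'C': 'qrs'}

# Map each row's letter to its replacement, then fill only the missing cells.
df['x'] = df['x'].fillna(df['a'].map(fill_values))
print(df)
```

The catch is that the dictionary has to be kept up to date by hand whenever a new letter shows up in the data.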
