Although it's not good coding practice, I've run into a particular kind of problem in which I need to go through a column of lists and erase particular values. I suppose one solution would be to melt the 'neighbors' column, but I believe the code I've written is close to the objective. I've prepared a reproducible example for better understanding:
import pandas as pd
import numpy as np
def removing_nan_neighboors(custom_df):
    nan_list = list(custom_df[custom_df['values'].notna()]['customer'])
    print(nan_list)
    custom_df['neighbors'] = [x for x in custom_df['neighbors'] if x not in nan_list]
    return custom_df
customer = [1, 2, 3, 4, 5, 6]
values = [np.nan, np.nan, 10, np.nan, 11, 12]
neighbors = [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]]
df = pd.DataFrame({'customer': customer, 'values': values, 'neighbors': neighbors})
df = removing_nan_neighboors(df)
print(df)
   customer  values neighbors
0         1     NaN    [6, 2]
1         2     NaN    [1, 3]
2         3    10.0    [2, 4]
3         4     NaN    [3, 5]
4         5    11.0    [4, 6]
5         6    12.0    [5, 1]
The objective is to remove from each neighbors list every customer whose value is NaN:
   customer  values neighbors
0         1     NaN       [6]
1         2     NaN       [3]
2         3    10.0        []
3         4     NaN    [3, 5]
4         5    11.0       [6]
5         6    12.0       [5]
But I haven't gotten that far: my function doesn't work as intended yet and leaves the neighbors column unchanged. Any help is appreciated.
CodePudding user response:
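Try this, which assumes (as in your example) that each customer's neighbors are the previous and next rows, wrapping around at the ends: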
df["cust_1"] = np.where(
    np.isnan(np.roll(df["values"], 1)),
    np.nan,
    np.roll(df["customer"], 1),
)
df["cust_2"] = np.where(
    np.isnan(np.roll(df["values"], -1)),
    np.nan,
    np.roll(df["customer"], -1),
)
df["neighbors"] = df[["cust_1", "cust_2"]].agg(
    lambda x: list(x[x.notna()].astype(int)), axis=1
)
df = df.drop(columns=["cust_1", "cust_2"])
print(df)
Prints:
   customer  values neighbors
0         1     NaN       [6]
1         2     NaN       [3]
2         3    10.0        []
3         4     NaN    [3, 5]
4         5    11.0       [6]
5         6    12.0       [5]
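The np.roll approach above rebuilds the lists from the row order, so it only fits data where the neighbors really are the adjacent rows. If that isn't guaranteed, here is a rough sketch of the "melting" idea mentioned in the question, using explode to get one row per neighbor. The variable names (nan_customers, exploded, kept, regrouped) are my own and not from the original post, and I'm assuming every neighbors list is non-empty to begin with:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})

# Customers whose value is missing, as plain Python ints.
nan_customers = df.loc[df["values"].isna(), "customer"].tolist()

# One row per (original index, neighbor) pair.
exploded = df["neighbors"].explode()

# Keep only the neighbors that are not NaN customers.
kept = exploded[~exploded.isin(nan_customers)]

# Collect the survivors back into lists, keyed by the original index.
regrouped = kept.groupby(level=0).agg(list)

# Rows that lost every neighbor are absent from `regrouped`; give them [].
df["neighbors"] = [regrouped.get(i, []) for i in df.index]

print(df["neighbors"].tolist())
```

On the example data this should give the same lists as the output above. The explicit fallback to [] is needed because a row whose neighbors were all removed (customer 3 here) drops out of the groupby result entirely.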
CodePudding user response:
If I understood your objective correctly, you want to remove from every neighbors list the customers whose values entry is NaN, i.e. to get the result shown in your last table.
I did that with a nested list comprehension:
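# Customers whose value is NaN; computed once instead of once per element
nan_customers = set(df.loc[df['values'].isna(), 'customer'])
df['neighbors_new'] = [[n for n in neighbor if n not in nan_customers]
                       for neighbor in df['neighbors']]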
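For completeness, here is a self-contained sketch of the same idea wrapped in a function and run on the example frame from the question. The helper name drop_nan_neighbors and the choice to return a copy are my own assumptions for illustration, not something from the original post:

```python
import numpy as np
import pandas as pd


def drop_nan_neighbors(frame):
    """Return a copy of `frame` whose 'neighbors' lists omit NaN-valued customers.

    Hypothetical helper; assumes columns 'customer', 'values' and 'neighbors'.
    """
    # Customers whose 'values' entry is missing.
    nan_customers = set(frame.loc[frame["values"].isna(), "customer"])
    result = frame.copy()
    # Filter inside each row's list rather than across rows.
    result["neighbors"] = [
        [n for n in row if n not in nan_customers] for row in frame["neighbors"]
    ]
    return result


df = pd.DataFrame({
    "customer": [1, 2, 3, 4, 5, 6],
    "values": [np.nan, np.nan, 10, np.nan, 11, 12],
    "neighbors": [[6, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 1]],
})
cleaned = drop_nan_neighbors(df)
print(cleaned["neighbors"].tolist())
```

Compared with the function in the question, the two differences are that the mask uses isna() instead of notna(), and that the membership test is applied to each element of a row's list. In the original, x was a whole list, so `x not in nan_list` was always true and nothing was removed.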
