For the DataFrame df below
df = pd.DataFrame([('Tesla','Model3', '-', 'Motor'),
('Tesla', 'ModelS', '-', 'MotorMD3'),
('Tesla', 'ModelX', '-', 'MotorMD3'),
('Tesla', 'ModelY', '-', 'Motor'),
('Jeep', 'Wrangler','Grouped','Engine'),
('Jeep', 'Compass', 'Grouped','EngineMD3'),
('Jeep', 'Patriot', 'Grouped','Engine'),
('Jeep', 'Cherokee','Grouped','Engine'),
('Ford', 'Mustang', 'Grouped','Engine'),
('Ford', 'F150', 'Grouped','Engine') ],columns=['Make','Model','Status','Type'])
df
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped Engine
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped Engine
7 Jeep Cherokee Grouped Engine
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
I am trying to update the column Type with EngineMD3 for all same Make, if EngineMD3 is present in any of the Models in that Make, and if the Status is Grouped for that Make. But if the Status is not Grouped, Type should be kept as such for each Models. If 'EngineMD3' is not present the Type should be maintained as Engine.
For instance, Tesla is not Grouped, so each model keeps their Type the same. But Jeep is Grouped, and Compass is having its Type as EngineMD3, so EngineMD3 is updated as the Type for all Jeep Models. Ford is Grouped but none of the Models have type EngineMD3 so Type is kept as Engine
Expected output
Make Model Status Type
0 Tesla Model3 - Motor #For Tesla Type is maintained for each model seperately since it is not grouped
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3 #Since Jeep is grouped, all its Type is changed to EngineMD3 since one of the model had EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine #Even though Ford is grouped, since there is no EngineMD3 the Type is maintained as Engine.
9 Ford F150 Grouped Engine
In other words,
The conditions are for all the makes(eg. Jeeps) If the make is grouped and if FD3 is appended to any of the model types, then all the grouped models in the same make will have the FD3 appended to them
I tried to use np.select to update the Type column with multiple conditions but I couldn't give a condition to select all same Make at once and update the Type. Please do help I am running out of options here.
CodePudding user response:
df.loc[df.Status.eq('Grouped'), 'Type'] = df[df.Status.eq('Grouped')].groupby('Make').Type.transform(
lambda x: 'EngineMD3' if x.eq('EngineMD3').any() else x)
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
CodePudding user response:
You can try this:
def process(g):
if sum(g.Status=='Grouped')>0:
g['Type'] = 'EngineMD3' if sum(g.Type=='EngineMD3')>0 else 'Engine'
return g
df.groupby('Make').apply(process)
Output:
Make Model Status Type
0 Tesla Model3 - Motor
1 Tesla ModelS - MotorMD3
2 Tesla ModelX - MotorMD3
3 Tesla ModelY - Motor
4 Jeep Wrangler Grouped EngineMD3
5 Jeep Compass Grouped EngineMD3
6 Jeep Patriot Grouped EngineMD3
7 Jeep Cherokee Grouped EngineMD3
8 Ford Mustang Grouped Engine
9 Ford F150 Grouped Engine
CodePudding user response:
We can use the following function :
def trans(s_1, s_2, s_3):
Make_values, new_col = [], []
for i in range(len(s_1)):
if s_3[i]=="EngineMD3" and s_2[i]=="Grouped":
Make_values.append(s_1[i])
for i in range(len(Make_values)):
for j in range(len(s_3)):
if s_1[j]==Make_values[i]:
new_col.append("EngineMD3")
else:
new_col.append(s_3[j])
return new_col
And we apply it:
df['Type']=trans(df['Make'], df['Status'], df['Type'])
CodePudding user response:
I'm no python expert, I'm sure there are more optimal solutions... But here is one.
- Checking the length of the query > 0.
- If it is > 0, then we store the value of Make that satisfies the conditions you mentioned in y variable.
- Then we replace the Type of that specific Make.
if len(df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")) > 0:
x = df.query("Status == 'Grouped' & Type.str.startswith('EngineMD3')")['Make'].values
for i in range(len(x)):
y = x[i]
df['Type'].loc[df['Make'] == y] = 'EnginerMD3'
display(df)
