Please consider a pandas DataFrame final_df with 142457 rows, indexed from 0 to 142456:
0
1
2
3
4
...
142452
142453
142454
142455
142456
I sample a new DataFrame, data_test_for_all_models, from this one:
data_test_for_all_models = final_df.copy().sample(frac=0.1, random_state=786)
A few indexes:
2235
118727
23291
Now I drop the rows from final_df whose indexes are in data_test_for_all_models:
final_df = final_df.drop(data_test_for_all_models.index)
If I check whether one of those indexes is still present in final_df:
final_df.iloc[2235]
it still returns a row, which I did not expect.
I think the index is being reset somewhere, but which function does it: drop() or sample()?
Thanks.
CodePudding user response:
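You are using .iloc, which indexes by integer position. You are getting the row at position 2235, not the row with index label 2235. Neither drop() nor sample() resets the index; the remaining rows keep their original labels.
To look up a row by its label, use .loc: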
final_df.loc[2235]
This should raise a KeyError, because that label was dropped.
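To see the difference for yourself, here is a minimal sketch. It assumes a stand-in final_df with a single made-up column ("value") and the default 0..142456 index; the names remaining, sampled_label, label_still_present, row_by_position and loc_raised are only for illustration. It takes a label from the sample itself rather than hard-coding 2235, since which labels get sampled may depend on your pandas/NumPy versions.

```python
import pandas as pd

# Hypothetical stand-in for final_df: one made-up column, default RangeIndex.
final_df = pd.DataFrame({"value": range(142457)})

# Same sampling call as in the question (the .copy() is not needed here).
data_test_for_all_models = final_df.sample(frac=0.1, random_state=786)

# drop() removes rows by label and leaves the other labels untouched.
remaining = final_df.drop(data_test_for_all_models.index)

# Pick a label that is known to be in the sample.
sampled_label = data_test_for_all_models.index[0]

# Membership test on the index: the label is gone after drop().
label_still_present = sampled_label in remaining.index

# .iloc is positional: position 2235 exists as long as there are enough rows.
row_by_position = remaining.iloc[2235]

# .loc is label-based: asking for a dropped label raises KeyError.
try:
    remaining.loc[sampled_label]
    loc_raised = False
except KeyError:
    loc_raised = True

print(label_still_present, loc_raised, row_by_position.name)
```

If you do want positions and labels to line up again after dropping, calling remaining.reset_index(drop=True) renumbers the rows from 0, but then the labels no longer match the ones in the original final_df.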
