I have a pandas DataFrame, "tracks", that I'm filtering for erroneous altitude information. When an altitude is below a certain threshold, I want to throw out all rows that share that row's track_key. In the example, N123P on track_key 4xuut has an erroneous altitude, so I want to remove ALL rows whose track_key is "4xuut", but NOT the rows below them that have the same callsign.
| track_key | callsign | aircraft_type | speed | altitude |
|---|---|---|---|---|
| 4xuut | N123P | C550 | 300 | -1 |
| 4xuut | N123P | C550 | 297 | 15 |
| 4yt06 | N123P | C550 | 305 | 1022 |
| 4yt06 | N123P | C550 | 301 | 1028 |
| 4xx21 | N348U | GALX | 350 | 1025 |
I've tried this:
tracks = tracks[tracks.track_key != tracks.loc[tracks['altitude'].astype('float') <= field_elev, 'track_key'].iloc[0]]
but it only removes the first matching track_key (there can be several), and if there are no matches I get an "out-of-bounds" error.
CodePudding user response:
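Try this. It computes the minimum altitude of each track_key and keeps only the tracks whose minimum is above 0 (replace 0 with your own threshold, such as field_elev):
tracks[tracks.groupby('track_key')['altitude'].transform('min') > 0]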
Output:
track_key callsign aircraft_type speed altitude
2 4yt06 N123P C550 305 1022
3 4yt06 N123P C550 301 1028
4 4xx21 N348U GALX 350 1025
Thanks to @bkeesey for this solution.
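The same idea can be written against the threshold from the question instead of a hard-coded 0. This is a minimal sketch, not a definitive implementation: the value given to field_elev is an assumption for illustration, and the cast to float mirrors the question's own attempt.

```python
import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
    'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
    'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX'],
    'speed': [300, 297, 305, 301, 350],
    'altitude': [-1, 15, 1022, 1028, 1025],
})

field_elev = 0.0  # assumed threshold; altitudes at or below it count as erroneous

# Lowest altitude of each track, broadcast back onto every row of that track
lowest = tracks['altitude'].astype('float').groupby(tracks['track_key']).transform('min')

# Keep only the rows whose whole track stays above the threshold
filtered = tracks[lowest > field_elev]
print(filtered)
```

Because the comparison yields a boolean mask over all rows, it removes every offending track at once and simply keeps everything when no track is at or below the threshold.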
CodePudding user response:
You see the out-of-bounds error because, when no row has an erroneous altitude, the selection is empty and there is no value at index 0 to access.
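A minimal sketch of that failure, assuming a small two-row frame and a threshold field_elev of 0 (both made up for illustration):

```python
import pandas as pd

tracks = pd.DataFrame({
    'track_key': ['4yt06', '4xx21'],
    'altitude': [1022, 1025],
})
field_elev = 0  # assumed threshold for illustration

# No altitude is at or below the threshold, so the selection is empty
bad_keys = tracks.loc[tracks['altitude'] <= field_elev, 'track_key']
print(len(bad_keys))  # 0

try:
    bad_keys.iloc[0]  # position 0 does not exist in an empty Series
    raised = False
except IndexError as exc:
    raised = True
    print(type(exc).__name__, exc)
```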
To handle that case, I guarded the removal with an if check, as follows:
import pandas as pd
tracks = pd.DataFrame({
'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21'],
'callsign': ['N123P', 'N123P', 'N123P', 'N123P', 'N348U'],
'aircraft_type': ['C550', 'C550', 'C550', 'C550', 'GALX'],
'speed': [300, 297, 305, 301, 350],
'altitude': [-1, 15, 1022, 1028, 1025],
})
# track_key callsign aircraft_type speed altitude
#0 4xuut N123P C550 300 -1
#1 4xuut N123P C550 297 15
#2 4yt06 N123P C550 305 1022
#3 4yt06 N123P C550 301 1028
#4 4xx21 N348U GALX 350 1025
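erroneous = -1
# All track keys with at least one erroneous altitude (may be empty)
keys_to_delete = tracks.loc[tracks['altitude'] == erroneous, 'track_key'].unique()
if len(keys_to_delete) > 0:
    # isin drops every matching track, not just the first one
    tracks = tracks[~tracks['track_key'].isin(keys_to_delete)]
print(tracks)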
# track_key callsign aircraft_type speed altitude
#2 4yt06 N123P C550 305 1022
#3 4yt06 N123P C550 301 1028
#4 4xx21 N348U GALX 350 1025
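The question compares against a threshold rather than a single sentinel value, and notes that several tracks can be affected. One way to cover both is sketched below. The function name drop_bad_tracks, the extra track 4zz99, and the threshold values are all hypothetical and used only for illustration.

```python
import pandas as pd

def drop_bad_tracks(tracks, field_elev):
    """Drop every row of any track with an altitude at or below field_elev."""
    bad = tracks['altitude'].astype('float') <= field_elev
    bad_keys = tracks.loc[bad, 'track_key'].unique()
    # isin accepts zero, one, or many keys, so no special case is needed
    return tracks[~tracks['track_key'].isin(bad_keys)]

tracks = pd.DataFrame({
    'track_key': ['4xuut', '4xuut', '4yt06', '4yt06', '4xx21', '4zz99'],
    'altitude': [-1, 15, 1022, 1028, 1025, -5],  # 4zz99 is a made-up second bad track
})

cleaned = drop_bad_tracks(tracks, field_elev=0)       # removes 4xuut and 4zz99
unchanged = drop_bad_tracks(tracks, field_elev=-100)  # no altitude is that low
print(cleaned)
print(len(unchanged))
```

With no matching keys, isin returns an all-False mask, so the frame comes back unchanged and the out-of-bounds error cannot occur.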
