I am trying to calculate the average of a column from an external file that I loaded into a data frame. I only want to average the positive numbers in the column "DEPARTURE DELAY". My idea was to extend the data frame with a new column that keeps the positive numbers and replaces every negative one with 0. Is that possible? If not, are there other ways?
CodePudding user response:
Setup:
import pandas as pd

df = pd.DataFrame(
{
"CARRIER": ['9E', '9E','9E', '9E', '9E'],
"ORIGIN": ['ATL','ATL','ATL','ATL','ATL'],
"DESTINATION": ['CSG', 'CSG','CSG','CSG','CSG'],
"DEPARTURE_DELAY": [-2, -5, -5, -5, -5],
"PLANNED_DURATION": [47,47,47, 47, 47],
"ACTUAL_DURATION": [37, 32, 39, 37, 41],
"DISTANCE": [83, 83, 83, 83, 83]
}
)
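As you asked about replacing all values less than 0 with 0, you can do it using
num = df._get_numeric_data() # numeric columns only, since some columns are not numeric; this is a private method, df.select_dtypes(include="number") is the public equivalent
num[num < 0] = 0 # replace all negatives with 0
To get the mean of each numeric column: num.mean() (depending on the pandas version, num may be a copy rather than a view of df, so df.mean(numeric_only=True) might still see the original negative values)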
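Note that replacing negatives with 0 still counts those rows in the average, which is not the same as averaging only the positive values. A minimal sketch of both options, assuming made-up sample values (the setup above has no positive delays) and a hypothetical new column name POSITIVE_DELAY:

```python
import pandas as pd

# Hypothetical sample values, chosen so that some delays are positive
delays = pd.DataFrame({"DEPARTURE_DELAY": [-2, 10, -5, 20, 0]})

# New column: negatives replaced by 0, positives kept
# (the column name POSITIVE_DELAY is an assumption for illustration)
delays["POSITIVE_DELAY"] = delays["DEPARTURE_DELAY"].clip(lower=0)

# Mean with negatives counted as 0: (0 + 10 + 0 + 20 + 0) / 5
mean_with_zeros = delays["POSITIVE_DELAY"].mean()

# Mean over strictly positive values only: (10 + 20) / 2
positive_only = delays.loc[delays["DEPARTURE_DELAY"] > 0, "DEPARTURE_DELAY"]
mean_positive_only = positive_only.mean()

print(mean_with_zeros)     # 6.0
print(mean_positive_only)  # 15.0
```

If only the average of the positive delays is needed, the boolean filter alone is enough and no extra column is required.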

