I have an (example) dataframe that looks like this:
                  time event type
0  2022-01-22 10:35:00          a
1  2022-01-22 11:37:00          a
2  2022-01-22 22:22:00          b
3  2022-01-22 12:05:00          b
4  2022-01-22 10:09:00          c
5  2022-01-22 10:57:00          a
6  2022-01-22 11:36:00          c
7  2022-01-22 09:45:00          a
I would like to create a 3D surface plot that shows how many events of each type occur per hour. The axes of the plot should be:
X: hour
Y: event type
Z: number of events
I would expect to see 9, 10, 11, 12, 22 on the X axis and a, b, c on the Y axis. The Z values should reflect the number of events per type per hour, e.g. X=10, Y=a, Z=2.
I looked at the documentation and various examples, but could not find an answer.
CodePudding user response:
This requires two steps. First, aggregate your data to count how often each (hour, event type) pair occurs; then create the 3D plot from the aggregated hour, event type, and event count data:
from matplotlib import pyplot as plt
from matplotlib.ticker import MaxNLocator
import pandas as pd
import numpy as np
#test data
np.random.seed(123)
n = 10
start = pd.to_datetime("2021-04-21")
end = pd.to_datetime("2021-04-23")
n_minut = ((end - start).days + 1) * 24 * 60
date_range = pd.to_timedelta(np.random.randint(0, n_minut, n), unit="minute") + start
df = pd.DataFrame({"time": date_range, "event type": np.random.choice(list("abcde"), n)})
#count event types per hour
plot_df = df.groupby([df["time"].dt.hour, df["event type"]]).size().reset_index(name="event count")
#transcribe categorical data in column "event type" into integer values
#idx contains the list of event types according to their integer numbers
val, idx = plot_df["event type"].factorize()
plot_df["event_num"] = val
#generate evenly spaced x- and y-values
x_range = np.arange(24)
y_range = np.arange(idx.size)
#and create x-y arrays for the 3D plot
X, Y = np.meshgrid(x_range, y_range)
#and fill z-values with zeros
Z = np.zeros(X.shape)
#or the event count, if exists
Z[plot_df["event_num"], plot_df["time"]] = plot_df["event count"]
#create figure with a 3D projection axis
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(X, Y, Z)
ax.zaxis.set_major_locator(MaxNLocator(integer=True))
ax.set_yticks(y_range, idx)
ax.set_ylabel("event type")
ax.set_xlabel("time (in h)")
ax.set_zlabel("count")
plt.show()
Sample output: (image not included)
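To check the aggregation step against the data from the question, the count grid can also be built with `pd.crosstab`. This is only a sketch of an alternative to the `groupby`/`factorize` route above, not part of the original answer; it assumes you want only the hours that actually occur (9, 10, 11, 12, 22) as grid columns, and the variable names `counts`, `hours` and `labels` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-01-22 10:35:00", "2022-01-22 11:37:00",
        "2022-01-22 22:22:00", "2022-01-22 12:05:00",
        "2022-01-22 10:09:00", "2022-01-22 10:57:00",
        "2022-01-22 11:36:00", "2022-01-22 09:45:00",
    ]),
    "event type": list("aabbcaca"),
})

# Rows: event type, columns: hour of day.
# Pairs that never occur are filled with 0 by crosstab.
counts = pd.crosstab(df["event type"], df["time"].dt.hour)

hours = counts.columns.to_numpy()   # hours that actually occur
labels = counts.index.to_numpy()    # event types in sorted order
Z = counts.to_numpy()               # shape: (number of types, number of hours)

# Evenly spaced grid positions; use hours/labels as tick labels when plotting
X, Y = np.meshgrid(np.arange(hours.size), np.arange(labels.size))

print(counts)
```

Because the grid positions are evenly spaced, the gap between hour 12 and hour 22 is compressed on the X axis; if the true spacing matters, the full 0 to 23 range used in the answer is the safer choice.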
However, matplotlib is known to sometimes draw surfaces in the wrong order of visibility. Depending on your data, you might be better off with a scatter plot:
...
X, Y = np.meshgrid(x_range, y_range)
Z = np.full(X.shape, np.nan)
Z[plot_df["event_num"], plot_df["time"]] = plot_df["event count"]
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(X, Y, Z)
ax.zaxis.set_major_locator(MaxNLocator(integer=True))
...
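For convenience, the scatter variant can be assembled into one self-contained function. This is a sketch that combines the steps shown above; the function name `plot_event_scatter` is hypothetical, and it assumes the dataframe has a datetime column "time" and a column "event type" as in the question.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.ticker import MaxNLocator


def plot_event_scatter(df):
    """Hypothetical helper: scatter the event count per (hour, event type)."""
    # Count events per hour and event type
    plot_df = (
        df.groupby([df["time"].dt.hour, df["event type"]])
        .size()
        .reset_index(name="event count")
    )
    # Map event types to integers; idx holds the labels in that order
    val, idx = plot_df["event type"].factorize()

    x_range = np.arange(24)
    y_range = np.arange(idx.size)
    X, Y = np.meshgrid(x_range, y_range)

    # NaN marks (hour, type) pairs without events, so no marker is drawn there
    Z = np.full(X.shape, np.nan)
    Z[val, plot_df["time"].to_numpy()] = plot_df["event count"].to_numpy()

    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    ax.scatter(X, Y, Z)
    ax.zaxis.set_major_locator(MaxNLocator(integer=True))
    ax.set_yticks(y_range)
    ax.set_yticklabels(idx)
    ax.set_xlabel("time (in h)")
    ax.set_ylabel("event type")
    ax.set_zlabel("count")
    return fig, ax, Z
```

Calling `fig, ax, Z = plot_event_scatter(df)` followed by `plt.show()` displays the figure; the returned `Z` array lets you inspect the counts that were plotted.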


