I'm using tensorflow 2.4.0, and here's the code of the tf.keras EarlyStopping callback, in particular the method, of the EarlyStopping class, called at the end of each epoch (on_epoch_end):
def on_epoch_end(self, epoch, logs=None):
current = self.get_monitor_value(logs).
if self.monitor_op(current - self.min_delta, self.best):
self.best = current
self.wait = 0
if self.restore_best_weights:
self.best_weights = self.model.get_weights()
else:
self.wait = 1
if self.wait >= self.patience:
self.stopped_epoch = epoch
self.model.stop_training = True
if self.restore_best_weights:
if self.verbose > 0:
print('Restoring model weights from the end of the best epoch.')
self.model.set_weights(self.best_weights)
where, since, in my case, the monitored quantity is the val_loss:
self.monitor_op = np.less
In essence, the code performs this logic:
If (current - min_delta) < best:
best = current;
wait = 0
Otherwise:
wait = 1;
if wait >= patience:
stop training
Then if, for example:
min_delta = 0.1current = 0.9best = 0.85
we have that (current - min_delta) < best, thus:
best = current (=0.9)wait = 0
So, best is now associated to a worse value than the previous one (0.9 instead of 0.85); is it the correct/expected behavior of EarlyStopping?
It's seems strange
CodePudding user response:
If you take a look at the init method of the EarlyStopping class, you should see something like this:
if self.monitor_op == np.greater:
self.min_delta *= 1
else:
self.min_delta *= -1
So, since we know self.monitor_op = np.less, I think min_delta is actually -0.1 in your case and the if statement evaluation is something like: (0.9-(-0.1)) < 0.85. I am assuming you have an EarlyStopping callback defined as:
es = EarlyStopping(monitor='val_loss', mode='min')
Note also what min_delta actually is:
Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta, will count as no improvement.
