I have two tensors of different shapes produced by two models. When I print their shapes, I get the following:
caption loss is (2, 128)
image loss is (128, 128)
One tensor has shape (2, 128) and the other has shape (128, 128). The code that computes them is below:
captions_loss = keras.losses.kl_divergence(
y_true=targets, y_pred=logits, #from_logits=True
)
images_loss = keras.losses.kl_divergence(
y_true=tf.transpose(targets), y_pred=tf.transpose(logits), #from_logits=True
)
When I add the two like this, it throws an error:
return (captions_loss + images_loss) / 2
Is there a way to add these two tensors?
captions_loss shape: (2, 128)
images_loss shape: (128, 128)
CodePudding user response:
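Tensors follow broadcasting rules, but (2, 128) and (128, 128) are not compatible as they stand, so captions_loss first has to be reduced or sliced to shape (128,). You can try a few options and see how they affect model performance: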
import tensorflow as tf
captions_loss = tf.random.normal((2, 128))
images_loss = tf.random.normal((128, 128))
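# Option 1: sum captions_loss over axis 0, then add
(tf.reduce_sum(captions_loss, axis=0) + images_loss) / 2
# Option 2: average captions_loss over axis 0, then add
(tf.reduce_mean(captions_loss, axis=0) + images_loss) / 2
# Option 3: add each row of captions_loss separately
(captions_loss[0, :] + images_loss + captions_loss[1, :]) / 2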
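To check what each option actually produces, here is a minimal sketch that prints the result shapes. The tensors are random placeholders with the shapes from the question, not real loss values, and the names option_1 to option_3 are only for illustration.

```python
import tensorflow as tf

# Random placeholders with the shapes from the question (not real losses).
captions_loss = tf.random.normal((2, 128))
images_loss = tf.random.normal((128, 128))

# Reducing over axis 0 turns (2, 128) into (128,), which broadcasts
# against every row of the (128, 128) tensor.
option_1 = (tf.reduce_sum(captions_loss, axis=0) + images_loss) / 2
option_2 = (tf.reduce_mean(captions_loss, axis=0) + images_loss) / 2
# Each row slice has shape (128,), so it broadcasts the same way.
option_3 = (captions_loss[0, :] + images_loss + captions_loss[1, :]) / 2

for name, result in [("sum", option_1), ("mean", option_2), ("rows", option_3)]:
    print(name, tuple(result.shape))
```

All three results have shape (128, 128). Options 1 and 3 compute the same values up to floating-point rounding, since summing over axis 0 is the same as adding the two rows; only option 2 weights captions_loss differently.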
CodePudding user response:
If you convert your matrices to NumPy arrays, you can take advantage of NumPy's broadcasting, which works for compatible shapes (each pair of dimensions must either match or be 1):
import numpy as np
A = np.array([
[10, 10, 10]
])
B = np.array([
[2, 2, 2],
[3, 3, 3],
[1, 1, 1],
])
print(A.shape)
print(B.shape)
print(A + B)
Output:
(1, 3)
(3, 3)
[[12 12 12]
 [13 13 13]
 [11 11 11]]
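Note that the example above works because A has a leading dimension of 1. A short sketch with the shapes from the question shows that (2, 128) and (128, 128) do not broadcast directly, and that reducing the first array to shape (128,) makes the addition valid. The arrays are filled with ones purely for illustration, and averaging over axis 0 is just one possible reduction.

```python
import numpy as np

a = np.ones((2, 128))    # stands in for captions_loss
b = np.ones((128, 128))  # stands in for images_loss

# Leading dimensions 2 and 128 neither match nor equal 1,
# so NumPy refuses to broadcast them.
try:
    a + b
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Averaging over axis 0 gives shape (128,), which is compatible.
row = a.mean(axis=0)
combined = (row + b) / 2

print(broadcast_ok)    # False
print(combined.shape)  # (128, 128)
```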
