I have tensorflow tensor with probabilites like this:
>>> valid_4_preds
array([[0.9817431 , 0.01259811, 0.50729334, 0.00053732, 0.6966804 ,
0.00488825],
[0.9851129 , 0.01246135, 0.38177294, 0.00378728, 0.8398497 ,
0.68413687],
[0.00061161, 0.00005008, 0.00017785, 0.0000152 , 0.00017121,
0.00002404],
[0.9991425 , 0.23962161, 0.98579687, 0.01727398, 0.9354003 ,
0.3325037 ]], dtype=float32)
I now need to map the above probabilties with different thresholds to classes(or a tensor with texts) and get them.
>>> # printing classes
>>> classes
<tf.Tensor: shape=(6,), dtype=string, numpy=
array([b'class_1', b'class_2', b'class_3', b'class_4', b'class_5',
b'class_6'], dtype=object)>
>>> # converting to bools
>>> true_falses = tf.math.greater_equal(valid_4_preds, tf.constant([0.5, 0.40, 0.20, 0.80, 0.5, 0.4]))
>>> true_falses
<tf.Tensor: shape=(4, 6), dtype=bool, numpy=
array([[ True, False, True, False, True, False],
[ True, False, True, False, True, True],
[False, False, False, False, False, False],
[ True, False, True, False, True, False]])>
now, I am trying to get the texts at indices where true_falses has Trues(this is my expected output), like this:
>>> <some-tensorflow-operations>
<tf.Tensor: shape=(4, 6), dtype=bool, numpy=
array([['class_1', 'class_3', 'class_5'],
['class_1', 'class_3', 'class_5', 'class_6'],
[],
['class_1', 'class_3', 'class_5']])>
Here's what I have tried:
tf.boolean_maskseems to solve the purpose, but themaskit takes in, strictly has to be 1D array.tf.wherecan be used to get the indexes, output of which after reshaping to single dimension can be passed totf.gatherto get the respective classes like this:
>>> tf.gather(classes, tf.reshape(tf.where(true_falses[0] == True), shape=(-1,)))
<tf.Tensor: shape=(3,), dtype=string, numpy=array([b'class_1', b'class_3', b'class_5'], dtype=object)>
But, I haven't been able to figure out how to do this on 2D arrays.
this logic will go in a signature for serving via tensorflow-serving, so operations strictly only needs to be of tensorflow. How do I do this on 2D tensors or arrays? more efficient and quicker operations would be appreciated.
CodePudding user response:
import tensorflow as tf
classes = tf.constant([b'class_1', b'class_2', b'class_3', b'class_4', b'class_5', b'class_6'])
true_falses = tf.constant([
[ True, False, True, False, True, False],
[ True, False, True, False, True, True],
[False, False, False, False, False, False],
[ True, False, True, False, True, False]]
)
tf.ragged.boolean_mask(
data=tf.tile(tf.expand_dims(classes, 0), [tf.shape(true_falses)[0], 1]),
mask=true_falses
)
# <tf.RaggedTensor [[b'class_1', b'class_3', b'class_5'], [b'class_1', b'class_3', b'class_5', b'class_6'], [], [b'class_1', b'class_3', b'class_5']]>
