Get texts from an array at indexes where a 2D tensor's values are above or equal to thresholds in te

I have a TensorFlow tensor of probabilities like this:

>>> valid_4_preds
array([[0.9817431 , 0.01259811, 0.50729334, 0.00053732, 0.6966804 ,
        0.00488825],
       [0.9851129 , 0.01246135, 0.38177294, 0.00378728, 0.8398497 ,
        0.68413687],
       [0.00061161, 0.00005008, 0.00017785, 0.0000152 , 0.00017121,
        0.00002404],
       [0.9991425 , 0.23962161, 0.98579687, 0.01727398, 0.9354003 ,
        0.3325037 ]], dtype=float32)

I now need to compare these probabilities against a different threshold per class, map the results to the classes (a tensor of strings), and retrieve the matching class names.

>>> # printing classes
>>> classes
<tf.Tensor: shape=(6,), dtype=string, numpy=
array([b'class_1', b'class_2', b'class_3', b'class_4', b'class_5',
       b'class_6'], dtype=object)>
>>> # converting to bools
>>> true_falses = tf.math.greater_equal(valid_4_preds, tf.constant([0.5, 0.40, 0.20, 0.80, 0.5, 0.4]))
>>> true_falses
<tf.Tensor: shape=(4, 6), dtype=bool, numpy=
array([[ True, False,  True, False,  True, False],
       [ True, False,  True, False,  True,  True],
       [False, False, False, False, False, False],
       [ True, False,  True, False,  True, False]])>

Now I am trying to get the texts at the indices where true_falses is True (this is my expected output), like this:

>>> <some-tensorflow-operations>
<tf.Tensor: shape=(4, 6), dtype=bool, numpy=
array([['class_1', 'class_3', 'class_5'],
       ['class_1', 'class_3', 'class_5', 'class_6'],
       [],
       ['class_1', 'class_3', 'class_5']])>

Here's what I have tried:

tf.boolean_mask seems to fit the purpose, but as far as I can tell the mask it takes has to be a 1D array.
tf.where can be used to get the indices; after reshaping its output to a single dimension, it can be passed to tf.gather to get the respective classes, like this:

>>> tf.gather(classes, tf.reshape(tf.where(true_falses[0] == True), shape=(-1,)))
<tf.Tensor: shape=(3,), dtype=string, numpy=array([b'class_1', b'class_3', b'class_5'], dtype=object)>

But I haven't been able to figure out how to do this on 2D arrays.

This logic will go into a signature for serving via tensorflow-serving, so it must use only TensorFlow operations. How do I do this on 2D tensors or arrays? More efficient and faster operations would be appreciated.

CodePudding user response:

You could try tf.ragged.boolean_mask, tiling classes so that it has one row per row of the mask:

import tensorflow as tf

classes = tf.constant([b'class_1', b'class_2', b'class_3', b'class_4', b'class_5', b'class_6'])
true_falses = tf.constant([
    [ True, False,  True, False,  True, False],
    [ True, False,  True, False,  True,  True],
    [False, False, False, False, False, False],
    [ True, False,  True, False,  True, False]]
)

tf.ragged.boolean_mask(
    data=tf.tile(tf.expand_dims(classes, 0), [tf.shape(true_falses)[0], 1]),
    mask=true_falses
)
# <tf.RaggedTensor [[b'class_1', b'class_3', b'class_5'], [b'class_1', b'class_3', b'class_5', b'class_6'], [], [b'class_1', b'class_3', b'class_5']]>
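
If tiling classes for every row is a concern, the tf.where / tf.gather idea from the question can probably be extended to 2D as well. The following is only a sketch under some assumptions: the function names classes_above_thresholds and serve, the output key "classes", and the choice to pad the result into a dense tensor for the serving signature are all made up for illustration and do not come from the question.

```python
import tensorflow as tf

classes = tf.constant(["class_1", "class_2", "class_3",
                       "class_4", "class_5", "class_6"])
thresholds = tf.constant([0.5, 0.40, 0.20, 0.80, 0.5, 0.4])


def classes_above_thresholds(probs):
    """Hypothetical helper: one ragged row of class names per row of probs."""
    # The thresholds broadcast across the batch dimension.
    mask = tf.math.greater_equal(probs, thresholds)
    # On a 2D mask, tf.where returns int64 (row, column) pairs in row-major
    # order, so the row ids are already sorted as from_value_rowids requires.
    hits = tf.where(mask)
    rows, cols = hits[:, 0], hits[:, 1]
    return tf.RaggedTensor.from_value_rowids(
        values=tf.gather(classes, cols),
        value_rowids=rows,
        # Pass nrows explicitly so trailing all-False rows are kept as [].
        nrows=tf.shape(mask, out_type=tf.int64)[0],
    )


# Assumption: the client prefers a dense string tensor padded with "".
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 6], dtype=tf.float32)])
def serve(probs):
    ragged = classes_above_thresholds(probs)
    return {"classes": ragged.to_tensor(default_value="")}


probs = tf.constant([
    [0.9817431, 0.01259811, 0.50729334, 0.00053732, 0.6966804, 0.00488825],
    [0.9851129, 0.01246135, 0.38177294, 0.00378728, 0.8398497, 0.68413687],
    [0.00061161, 0.00005008, 0.00017785, 0.0000152, 0.00017121, 0.00002404],
    [0.9991425, 0.23962161, 0.98579687, 0.01727398, 0.9354003, 0.3325037],
])
result = classes_above_thresholds(probs)
```

Here result should hold the same rows as the tf.ragged.boolean_mask output above, and serve(probs)["classes"] is the padded dense version. Whether this is actually faster than the tiled mask depends on the batch size and the number of classes, so it is worth measuring both.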