Building the output layer of NLP model (is the "embedding" layer)

I was looking through some notebooks on Kaggle to get a deeper understanding of how NLP works. I came across a notebook for the natural language inference task of predicting the relationship between a given premise and hypothesis. It uses the pretrained BERT model for this task.

I had a question about the build_model() function:

import tensorflow as tf
from transformers import TFBertModel

max_len = 50

def build_model():
    bert_encoder = TFBertModel.from_pretrained("bert-base-multilingual-cased")
    input_word_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    input_type_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_type_ids")

    embedding = bert_encoder([input_word_ids, input_mask, input_type_ids])[0] # confused about this line
    output = tf.keras.layers.Dense(3, activation='softmax')(embedding[:,0,:])

    model = tf.keras.Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=output)
    # learning_rate replaces the deprecated lr argument
    model.compile(tf.keras.optimizers.Adam(learning_rate=1e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    return model

I am confused about this line: embedding = bert_encoder([input_word_ids, input_mask, input_type_ids])[0]

What does this "embedding" represent, and why is there a [0] after the function call? Why is the bert_encoder used to create this "embedding"?

Thanks in advance!

CodePudding user response：

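You need the [0] to get a plain tensor for computation (a torch.Tensor in my case): calling the model returns an output object that holds several tensors, and [0] selects the first one. With the model I used, you can also write output.logits instead of output[0].

P.S. I used AutoModelForMaskedLM, not TFBertModel, so it might be a little different. For TFBertModel the first element is last_hidden_state rather than logits, but just try printing out your embedding first =]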
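
To make the indexing concrete, here is a minimal sketch that replaces the real encoder with a stand-in. ToyEncoderOutput and toy_encoder are hypothetical names made up for this illustration; they only mimic the layout of what TFBertModel returns (last_hidden_state first, pooler_output second) and fill it with random numbers instead of real BERT activations. It assumes NumPy is installed.

```python
from typing import NamedTuple

import numpy as np


class ToyEncoderOutput(NamedTuple):
    """Hypothetical stand-in for the object returned by calling the encoder."""
    last_hidden_state: np.ndarray  # (batch, max_len, hidden_size): one vector per token
    pooler_output: np.ndarray      # (batch, hidden_size): one vector per sequence


def toy_encoder(batch_size: int, max_len: int = 50, hidden_size: int = 768) -> ToyEncoderOutput:
    """Return random arrays shaped like BERT-base outputs (no real model involved)."""
    rng = np.random.default_rng(0)
    return ToyEncoderOutput(
        last_hidden_state=rng.normal(size=(batch_size, max_len, hidden_size)),
        pooler_output=rng.normal(size=(batch_size, hidden_size)),
    )


outputs = toy_encoder(batch_size=2)

embedding = outputs[0]            # [0] picks the first field, i.e. last_hidden_state
cls_vectors = embedding[:, 0, :]  # keep only position 0 (the [CLS] token) of each sequence

print(embedding.shape)    # (2, 50, 768)
print(cls_vectors.shape)  # (2, 768)
```

So in build_model(), "embedding" would be the per-token hidden states, and embedding[:,0,:] hands the Dense layer one 768-dimensional vector per example, taken from the first token position.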