Found input variables with inconsistent numbers of samples: [3070, 6140]

First, I split the data like this. I reshape the labels because without the reshape I got this error: "logits and labels must have the same shape ((?, 1) vs (?,2))"

 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
 y_train = np.asarray(y_train).astype('float32').reshape((-1,1))
 y_test = np.asarray(y_test).astype('float32').reshape((-1,1))
 print(X_test.shape)
 print(X_train.shape)
 print(y_train.shape)
 print(y_test.shape)

After that, I build the BiLSTM model.

LSTMmodel = keras.Sequential()
LSTMmodel.add(Embedding(14301, 32, input_length=X_train.shape[1], name="embedding"))
LSTMmodel.add(Bidirectional(LSTM(64, return_sequences=True), backward_layer=LSTM(64, return_sequences=True, go_backwards=True)))
LSTMmodel.add(Dropout(0.2))
LSTMmodel.add(Bidirectional(LSTM(64)))
LSTMmodel.add(Dropout(0.2))
LSTMmodel.add(Dense(1, activity_regularizer=l2(0.002)))
LSTMmodel.add(Activation('sigmoid'))

lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=10000,
    decay_rate=0.9)
opt = keras.optimizers.Adam(learning_rate=lr_schedule)
LSTMmodel.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
LSTMmodel.summary()

LSTMmodel.save('modelbilstm.h5')
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=8)
mc = ModelCheckpoint('modelbilstm.h5', monitor='val_accuracy', mode='max', verbose=1, save_best_only=True)
LSTMhistory=LSTMmodel.fit(X_train, y_train , epochs=20, batch_size=128, validation_split=0.2, verbose=0, callbacks=[es,mc])
LSTMhistory.history

After that, I inspect the weights of the word embedding layer.

weights = LSTMmodel.get_layer('embedding').get_weights()[0]
weights[:5]

But when I try to compute the confusion matrix, I get the error from the title.

saved_model_bilstm = load_model('modelbilstm.h5')

y_predLSTM =(saved_model_bilstm.predict(X_test) > 0.5).astype("int32")
print (confusion_matrix(y_predLSTM, y_test))
print (classification_report(y_predLSTM, y_test))

What could be going wrong?

CodePudding user response：
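
One possible cause, going only by the numbers in the question: 6140 is exactly 2 x 3070, and the earlier error mentions labels of shape (?, 2). If y is one-hot encoded with two columns (an assumption, since y itself is not shown), then reshape((-1,1)) does not pick one label per sample. It turns every value into its own row, so y_test ends up with twice as many rows as X_test. The sketch below reproduces that with made-up labels; class_ids and y_test_onehot are hypothetical stand-ins, not the asker's data.

```python
import numpy as np

# Hypothetical stand-in for the test labels: 3070 one-hot rows with 2 columns.
# The real y is not shown in the question, so this shape is an assumption.
rng = np.random.default_rng(0)
class_ids = rng.integers(0, 2, size=3070)
y_test_onehot = np.eye(2, dtype="float32")[class_ids]  # shape (3070, 2)

# What the reshape in the question does: every value becomes its own row.
y_flat = y_test_onehot.reshape((-1, 1))
print(y_flat.shape)  # (6140, 1)

# Collapse each one-hot row to a single class index instead,
# which keeps one label per sample.
y_labels = np.argmax(y_test_onehot, axis=1).astype("float32").reshape((-1, 1))
print(y_labels.shape)  # (3070, 1)
```

If that assumption holds, applying np.argmax(y, axis=1) to y before train_test_split should keep y_train and y_test at the same number of rows as X_train and X_test. Separately, scikit-learn's confusion_matrix and classification_report take the true labels first and the predictions second, so the calls would normally read confusion_matrix(y_test, y_predLSTM) and classification_report(y_test, y_predLSTM).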