Custom Loss Function Not Improving Validation Loss in TensorFlow 2.12
After trying multiple solutions online and looking through the documentation, I still can't figure out a perplexing problem with my custom loss function in TensorFlow 2.12. I've implemented a custom loss function that aims to improve the validation loss, but the validation loss remains stagnant and does not decrease throughout training. I created this custom loss to prioritize recall, as my dataset is imbalanced.

Here's the code snippet for my custom loss function:

```python
import tensorflow as tf

def custom_loss(y_true, y_pred):
    # Only focusing on positive class for recall calculation
    true_positives = tf.reduce_sum(tf.round(tf.clip_by_value(y_true * y_pred, 0, 1)))
    false_negatives = tf.reduce_sum(tf.round(tf.clip_by_value(y_true * (1 - y_pred), 0, 1)))
    recall = true_positives / (true_positives + false_negatives + tf.keras.backend.epsilon())
    return 1 - recall  # Minimizing 1 - recall
```

I compiled the model like this:

```python
model.compile(optimizer='adam', loss=custom_loss, metrics=['accuracy'])
```

During training, the training accuracy improves, but the validation loss holds steady around 0.3, which is quite strange. I've tried adjusting the learning rate, using different optimizers, and even normalizing my input data, but the validation loss still does not budge.

Here's how I'm fitting the model:

```python
history = model.fit(train_dataset, epochs=50, validation_data=val_dataset)
```

The training dataset has 20,000 samples with a 90:10 split for positive and negative classes. The validation set has 5,000 samples with a similar imbalance. I also checked the shapes of `y_true` and `y_pred` to ensure they match.

Any insights on why my custom loss function doesn't seem to be improving the validation loss? Could there be an issue with the way I'm calculating recall, or is there something else I'm missing?
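To make the recall arithmetic in `custom_loss` concrete, here is a minimal plain-Python sketch of the same steps, with no TensorFlow involved. The labels and predictions are made up for illustration, and `soft_recall_loss` is a hypothetical unrounded variant included only for comparison; it is not part of the original code. The sketch checks how each version responds when every prediction is nudged by a small amount, which may help narrow down where the loss stops reacting.

```python
# Plain-Python mirror of the arithmetic in custom_loss (no TensorFlow needed).
# The sample labels and predictions below are made up for illustration.

EPSILON = 1e-7  # assumed default value of tf.keras.backend.epsilon()


def clip01(x):
    """Clamp x to the range [0, 1], like tf.clip_by_value(x, 0, 1)."""
    return max(0.0, min(1.0, x))


def hard_recall_loss(y_true, y_pred):
    """Same steps as custom_loss: round the per-sample products, then sum."""
    tp = sum(round(clip01(t * p)) for t, p in zip(y_true, y_pred))
    fn = sum(round(clip01(t * (1 - p))) for t, p in zip(y_true, y_pred))
    return 1 - tp / (tp + fn + EPSILON)


def soft_recall_loss(y_true, y_pred):
    """Hypothetical variant without rounding, shown only for comparison."""
    tp = sum(clip01(t * p) for t, p in zip(y_true, y_pred))
    fn = sum(clip01(t * (1 - p)) for t, p in zip(y_true, y_pred))
    return 1 - tp / (tp + fn + EPSILON)


y_true = [1.0, 1.0, 1.0, 0.0]
y_pred = [0.9, 0.6, 0.3, 0.2]
# Nudge every prediction upward slightly, as a small update step might.
y_nudged = [p + 0.01 for p in y_pred]

hard_before = hard_recall_loss(y_true, y_pred)
hard_after = hard_recall_loss(y_true, y_nudged)
soft_before = soft_recall_loss(y_true, y_pred)
soft_after = soft_recall_loss(y_true, y_nudged)

print("rounded:  ", hard_before, hard_after)  # unchanged by the nudge
print("unrounded:", soft_before, soft_after)  # decreases after the nudge
```

In this toy case the rounded loss is identical before and after the nudge, because rounding maps each product to 0 or 1 and a small change rarely crosses the 0.5 boundary. The unrounded version does move, which suggests the rounding step is worth examining first.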
This is part of a larger service I'm building. My development environment is Ubuntu 22.04, and I'm working with Python in a Docker container on CentOS. Has anyone else encountered this, and what would be the recommended way to handle it? Thanks for any help you can provide!