Unexpected Overfitting in TensorFlow with Custom Callback Implementation

👀 Views: 15 💬 Answers: 1 📅 Created: 2025-06-05

tensorflow keras machine-learning Python

I've been struggling with this for a few days now and could really use some help. I'm using TensorFlow 2.10 and Keras to build a neural network for a classification problem, but I'm seeing unexpected overfitting despite implementing early stopping with a custom callback. Training accuracy approaches 98%, while validation accuracy plateaus around 75%. I tried to address this by adding dropout layers, but the issue persists. Here's my model architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# input_shape (number of features) and num_classes are defined elsewhere
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(input_shape,)),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation='softmax')
])
```

I've also implemented a custom callback for early stopping, which looks like this:

```python
class CustomEarlyStopping(tf.keras.callbacks.Callback):
    def __init__(self, patience=3):
        super().__init__()
        self.patience = patience
        self.best_val_loss = float('inf')
        self.wait = 0  # epochs since val_loss last improved

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current_val_loss = logs.get('val_loss')
        if current_val_loss is None:
            # No validation data was passed to fit(); nothing to compare
            return
        if current_val_loss < self.best_val_loss:
            self.best_val_loss = current_val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.model.stop_training = True
                print('Stopping training early. Validation loss did not improve.')
```

While training, I monitor the loss:

```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=50,
          callbacks=[CustomEarlyStopping(patience=5)])
```

I have tried increasing the dropout rates and augmenting the dataset, but the validation performance still doesn't improve. I also checked for data leakage and confirmed that the validation set is correctly separated.

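
To sanity-check the stopping rule itself, independent of TensorFlow, the callback's patience logic can be replayed in plain Python. This is only a minimal sketch: `simulate_early_stopping` is a hypothetical helper (not part of Keras), and the loss values are made up for illustration, not taken from my actual run.

```python
def simulate_early_stopping(val_losses, patience):
    """Replay the callback's stopping rule over per-epoch val_loss values.

    Returns (stop_epoch, best_epoch) as 0-based epoch indices.
    stop_epoch is None if training would run through every epoch.
    """
    best_val_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, val_loss in enumerate(val_losses):
        if val_loss < best_val_loss:
            # Improvement: remember it and reset the patience counter
            best_val_loss = val_loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Made-up losses: improvement until epoch 2, then steady degradation
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70, 0.74, 0.80, 0.85]
stop_epoch, best_epoch = simulate_early_stopping(losses, patience=5)
print(stop_epoch, best_epoch)  # 7 2
```

With this sequence the rule stops at epoch 7 while the best validation loss was at epoch 2. Since the callback never saves or restores weights, the model left after stopping would be the epoch-7 one, which may be part of what I'm seeing.
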
Could there be an issue with how the custom callback is integrated, or should I consider other strategies to mitigate overfitting? What other best practices should I apply at this stage? For context: I'm using Python 3.9. Thanks in advance!