CodexBloom - Programming Q&A Platform

Fine-Tuning BERT for Text Classification Using Hugging Face Transformers

👀 Views: 42 đŸ’Ŧ Answers: 1 📅 Created: 2025-06-07
huggingface transformers bert machine-learning python

I'm attempting to fine-tune a BERT model for a multi-class text classification task using the Hugging Face Transformers library (version 4.21.1), with the goal of eventually deploying it to production. My dataset consists of around 10,000 labeled texts across five categories. I followed the example in the documentation, but I'm running into unexpected behavior during training: the accuracy plateaus quickly and the model isn't learning effectively. Here is my code:

```python
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load dataset
dataset = load_dataset('my_dataset_name')  # make sure this is correctly set up

# Initialize tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=5)

# Preprocess the dataset
def preprocess_function(examples):
    return tokenizer(examples['text'], truncation=True, padding=True)

tokenized_dataset = dataset.map(preprocess_function, batched=True)

# Set training arguments
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='epoch',
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset['train'],
    eval_dataset=tokenized_dataset['test'],
)

# Start training
trainer.train()
```

During training, the loss decreases but the accuracy remains around 30% after several epochs. I've tried raising the learning rate to 5e-5 and increasing the number of epochs to 5, but the results still don't improve. Additionally, I see the following warning:

```
UserWarning: The `labels` argument is not required when `compute_loss` is set to False (which may be the case when evaluating).
```

Is there something wrong with how I'm setting up the training? Should I explicitly set `compute_loss=True`? Or is there an issue with my dataset? Any pointers on how to troubleshoot this would be greatly appreciated! My development environment is Debian with Python. Thanks for any help you can provide!
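Update: a few things I'm planning to try, in case they're relevant. First, since 30% is only slightly above the 20% chance level for five classes, I want to rule out class imbalance. Here's the quick sketch I intend to run (assuming the label column is named `label`; adjust if yours differs):

```python
from collections import Counter

# Quick sanity check on class balance in the training split
# (assumes the label column is named 'label' -- adjust if yours differs)
label_counts = Counter(dataset['train']['label'])
print(label_counts)
# If one class holds roughly 30% of the examples, a model stuck
# predicting that class would plateau right around the accuracy I'm seeing.
```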
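Second, I notice I never pass a `compute_metrics` function to `Trainer`, so evaluation only reports loss. This is the sketch I'm considering adding to get per-epoch accuracy (assuming numpy and scikit-learn are installed; I haven't validated it yet):

```python
import numpy as np
from sklearn.metrics import accuracy_score

def compute_metrics(eval_pred):
    # eval_pred unpacks to (logits, labels) in a standard eval loop
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {'accuracy': accuracy_score(labels, preds)}
```

I'd wire it in with `Trainer(..., compute_metrics=compute_metrics)`.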
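Finally, I wonder whether padding inside `map` matters here. An alternative I've seen suggested is to tokenize without padding and pad dynamically per batch with `DataCollatorWithPadding` (an untested sketch on my side):

```python
from transformers import DataCollatorWithPadding

# Tokenize without padding, then pad each batch dynamically at collation time
def preprocess_function(examples):
    return tokenizer(examples['text'], truncation=True)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
# then pass data_collator=data_collator to the Trainer
```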