Update README.md
Browse files
fix: Correct multi-label head size during model initialization
The script was failing on `model.load_state_dict()` with a `RuntimeError`
due to a size mismatch in the `multilabel_classifier` layer.
The root cause was an incorrect calculation of the multi-label head's
output dimension. The code was including the 3 `sentiment` labels when
calculating the size for the multi-label head, resulting in an expected
shape of [44, 768] instead of the correct [41, 768] from the checkpoint.
This commit corrects the logic in `load_essentials()` by explicitly
excluding the 'sentiment' task from the `multiclass_tasks` calculation.
This ensures the in-memory model architecture matches the saved weights,
resolving the loading error.
README.md
CHANGED
|
@@ -172,17 +172,25 @@ def get_sentiment_labels() -> Dict[int, str]: return {0: 'negative', 1: 'neutral', 2: 'positive'}
|
|
| 172 |
3. **Setup & Loading**: This setup function handles loading all components and reconstructing the necessary metadata.
|
| 173 |
```python
|
| 174 |
def load_essentials():
|
|
|
|
| 175 |
hub_repo_id = "spencercdz/xlm-roberta-sentiment-requests"
|
| 176 |
subfolder = "final_model"
|
| 177 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
|
|
|
| 178 |
|
| 179 |
all_labels_map = get_all_labels()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 180 |
binary_tasks = [k for k, v in all_labels_map.items() if len(v) == 2 and k not in ['related', 'sentiment']]
|
| 181 |
-
multiclass_tasks = {k: len(v) for k, v in all_labels_map.items() if len(v) > 2}
|
| 182 |
|
| 183 |
column_names = [f"{t}_{i}" for t, n in multiclass_tasks.items() for i in range(n)] + binary_tasks
|
| 184 |
multilabel_column_names = sorted(column_names)
|
| 185 |
-
num_multilabels = len(multilabel_column_names)
|
| 186 |
num_sentiment_labels = len(get_sentiment_labels())
|
| 187 |
|
| 188 |
tokenizer = AutoTokenizer.from_pretrained(hub_repo_id, subfolder=subfolder)
|
|
@@ -191,7 +199,7 @@ def load_essentials():
|
|
| 191 |
|
| 192 |
model_shell = MultiHeadClassificationModel(config=config, num_multilabels=num_multilabels)
|
| 193 |
weights_path = hf_hub_download(repo_id=hub_repo_id, filename="model.safetensors", subfolder=subfolder)
|
| 194 |
-
state_dict = load_file(weights_path, device=device)
|
| 195 |
model_shell.load_state_dict(state_dict, strict=False)
|
| 196 |
model = model_shell.to(device)
|
| 197 |
model.eval()
|
|
@@ -201,6 +209,7 @@ def load_essentials():
|
|
| 201 |
"multilabel_column_names": multilabel_column_names,
|
| 202 |
"all_labels": all_labels_map, "device": device
|
| 203 |
}
|
|
|
|
| 204 |
return model, tokenizer, metadata
|
| 205 |
```
|
| 206 |
***
|
|
@@ -281,7 +290,7 @@ The following hyperparameters were used during training:
|
|
| 281 |
|
| 282 |
### Training results
|
| 283 |
|
| 284 |
-
The final results on the evaluation set are based on the best checkpoint at epoch 594. A truncated history of the training results is shown below.
|
| 285 |
For the full data, please refer to [training_log.csv](https://huggingface.co/spencercdz/xlm-roberta-sentiment-requests/blob/main/training_log.csv) in the repository.
|
| 286 |
|
| 287 |
| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | Subset Accuracy |
|
|
|
|
| 172 |
3. **Setup & Loading**: This setup function handles loading all components and reconstructing the necessary metadata.
|
| 173 |
```python
|
| 174 |
def load_essentials():
|
| 175 |
+
print("Loading model, tokenizer, and metadata... (This may take a moment on first run)")
|
| 176 |
hub_repo_id = "spencercdz/xlm-roberta-sentiment-requests"
|
| 177 |
subfolder = "final_model"
|
| 178 |
device = "cuda" if torch.cuda.is_available() else "cpu"
|
| 179 |
+
print(f"Using device: {device}")
|
| 180 |
|
| 181 |
all_labels_map = get_all_labels()
|
| 182 |
+
|
| 183 |
+
# --- FIX IS HERE ---
|
| 184 |
+
# We must exclude 'sentiment' from the multiclass tasks for the multi-label head,
|
| 185 |
+
# because sentiment has its own dedicated classification head.
|
| 186 |
+
multiclass_tasks = {k: len(v) for k, v in all_labels_map.items() if len(v) > 2 and k != 'sentiment'}
|
| 187 |
+
# -------------------
|
| 188 |
+
|
| 189 |
binary_tasks = [k for k, v in all_labels_map.items() if len(v) == 2 and k not in ['related', 'sentiment']]
|
|
|
|
| 190 |
|
| 191 |
column_names = [f"{t}_{i}" for t, n in multiclass_tasks.items() for i in range(n)] + binary_tasks
|
| 192 |
multilabel_column_names = sorted(column_names)
|
| 193 |
+
num_multilabels = len(multilabel_column_names) # This will now correctly be 41
|
| 194 |
num_sentiment_labels = len(get_sentiment_labels())
|
| 195 |
|
| 196 |
tokenizer = AutoTokenizer.from_pretrained(hub_repo_id, subfolder=subfolder)
|
|
|
|
| 199 |
|
| 200 |
model_shell = MultiHeadClassificationModel(config=config, num_multilabels=num_multilabels)
|
| 201 |
weights_path = hf_hub_download(repo_id=hub_repo_id, filename="model.safetensors", subfolder=subfolder)
|
| 202 |
+
state_dict = load_file(weights_path, device="cpu") # Load to CPU first
|
| 203 |
model_shell.load_state_dict(state_dict, strict=False)
|
| 204 |
model = model_shell.to(device)
|
| 205 |
model.eval()
|
|
|
|
| 209 |
"multilabel_column_names": multilabel_column_names,
|
| 210 |
"all_labels": all_labels_map, "device": device
|
| 211 |
}
|
| 212 |
+
print("Loading complete.")
|
| 213 |
return model, tokenizer, metadata
|
| 214 |
```
|
| 215 |
***
|
|
|
|
| 290 |
|
| 291 |
### Training results
|
| 292 |
|
| 293 |
+
The final results on the evaluation set are based on the best checkpoint at epoch 594. A truncated history of the 25 most important rows is shown below.
|
| 294 |
For the full data, please refer to [training_log.csv](https://huggingface.co/spencercdz/xlm-roberta-sentiment-requests/blob/main/training_log.csv) in the repository.
|
| 295 |
|
| 296 |
| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | Subset Accuracy |
|