Maxwell Instruction Complexity Estimator (MICE)

A fast, efficient, and accurate instruction complexity scorer powered by ModernBERT-Large. MICE predicts normalized task difficulty scores (0–1) for English instructions, with an easy option to rescale to custom ranges.


πŸš€ Features

  • Lightweight & Fast: Leverages a compact backbone (ModernBERT-Large + LoRA) with only 14.4M trainable parameters.
  • Data-Driven: Trained on 66.5K English instruction–score pairs from the DEITA-Complexity dataset.
  • High Fidelity: Matches the performance of models 34× larger on standard complexity benchmarks.
  • Flexible Scoring: Outputs normalized scores (0–1) by default, with optional denormalization to any range (e.g., [1–6], [0–100]).

πŸ”§ Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "thethinkmachine/Maxwell-Task-Complexity-Scorer-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference only

# 1. Get normalized complexity (0–1)
def get_normalized_score(text: str) -> float:
    # Truncate to the 512-token training length to avoid overlong inputs
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze()
    return float(logits)

# 2. Denormalize to [min_score, max_score]
def get_denormalized_score(text: str, min_score: float = 1, max_score: float = 6) -> float:
    norm = get_normalized_score(text)
    raw = norm * (max_score - min_score) + min_score
    return float(round(raw, 2))

# Example
query = "Is learning equivalent to decreasing local entropy?"
print("Normalized:", get_normalized_score(query))
print("Evol-Complexity [1–6]:", get_denormalized_score(query))

πŸ“– Model Details

  • Architecture: ModernBERT-Large backbone with LoRA adapters (rank 32, alpha 64, dropout 0.1).
  • Task: Sequence Classification.
  • Languages: English.
  • Training Data: 66,500 instruction–score pairs from the BhabhaAI/DEITA-Complexity dataset.
  • Normalization: Min–max scaled to [0,1]; denormalize via score * (max - min) + min (see the worked example below).
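
As a quick worked example (0.62 is a hypothetical model output, not a real prediction):

norm = 0.62                      # hypothetical normalized score
evol = norm * (6 - 1) + 1        # 4.1 on the Evol-Complexity [1-6] scale
percent = norm * (100 - 0) + 0   # 62.0 on a [0-100] scale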

Data Distribution

Original Score   Count    %
1                8,729    13.3%
2                5,399    8.2%
3                10,937   16.7%
4                9,801    15.0%
5                24,485   37.4%
6                6,123    9.3%

Outlier scores (0 and 7–9) were pruned (together <1% of the data); a sketch of this preprocessing follows.
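
A minimal sketch of that preprocessing, assuming the raw labels live in a numeric score column (the column name and the pandas workflow are assumptions, not taken from the dataset card):

import pandas as pd

# Toy stand-in for the raw DEITA-Complexity labels.
df = pd.DataFrame({"score": [1, 3, 5, 7, 0, 6]})

# Prune outlier scores (0 and 7-9), then min-max scale 1-6 to [0, 1].
df = df[df["score"].between(1, 6)]
df["norm"] = (df["score"] - 1) / (6 - 1)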


βš™οΈ Training Configuration

  • Optimizer: AdamW (lr=5e-5, weight decay=0.01)
  • Batch Size: 8
  • Epochs: 3
  • Max Seq. Length: 512
  • Warmup: 10% of total steps
  • Compute: 50.3M training tokens; tokens-to-trainable-parameters (TTP) ratio ≈ 3.5 (50.3M tokens / 14.4M trainable parameters). A configuration sketch follows.
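
The configuration above translates roughly to the peft/transformers setup below. This is a sketch, not the author's training script: the target modules (ModernBERT's fused attention projections) and the output path are assumptions.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Regression head on the ModernBERT-Large backbone.
base = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large", num_labels=1
)

# LoRA adapters as described: rank 32, alpha 64, dropout 0.1.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.1,
    target_modules=["Wqkv", "Wo"],  # assumption: attention projection names
    task_type="SEQ_CLS",
)
model = get_peft_model(base, lora_config)

# Training hyperparameters from the card.
training_args = TrainingArguments(
    output_dir="mice-lora",          # hypothetical output path
    learning_rate=5e-5,
    weight_decay=0.01,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    warmup_ratio=0.1,                # 10% of total steps
)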

🌱 Environmental Impact

  • Compute Used: 16h on 1× NVIDIA L4 GPU (72W TDP) in GCP asia-south1.
  • CO₂ Emissions: 0.87 kg CO₂eq (fully offset).
  • Estimator: ML COβ‚‚ Impact Calculator.

πŸ” Bias & Limitations

  • Domain Bias: Trained primarily on general English; may underperform on technical/coding/math instructions.
  • Language: English-only.
  • Scaling Caution: Denormalization preserves ordering, but absolute values depend on the chosen range (see the illustration below).
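
A quick illustration: any affine rescaling score * (max - min) + min with max > min is strictly increasing, so rankings survive even though absolute values change (the two scores here are hypothetical):

a, b = 0.30, 0.70                            # hypothetical normalized scores, a < b
assert a * (6 - 1) + 1 < b * (6 - 1) + 1     # [1, 6] scale: 2.5 < 4.5
assert a * (100 - 0) < b * (100 - 0)         # [0, 100] scale: 30.0 < 70.0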

πŸ“š Citation

If you use MICE in your research, please cite:

Chaubey, S. (2024). Maxwell Instruction Complexity Estimator (MICE). https://huggingface.co/thethinkmachine/MICE


πŸ™‹β€β™‚οΈ Author & Contact

Shreyan C (thethinkmachine) Email: [email protected]

This project is licensed under the Apache 2.0 License.
