ModernBERT Reasoning Complexity Regressor

Model Description

This model predicts a continuous score for the reasoning complexity level (0-4) that a given web text suggests. It is fine-tuned from answerdotai/ModernBERT-base on the davanstrien/natural-reasoning-classifier dataset. It is intended as one stage of a pipeline that identifies text likely to be useful for generating reasoning data.

Reasoning Complexity Scale

The reasoning complexity scale ranges from:

0: Minimal Reasoning - Simple factual content requiring only recall
1: Basic Reasoning - Straightforward connections or single-step logical processes
2: Intermediate Reasoning - Integration of multiple factors or perspectives
3: Advanced Reasoning - Sophisticated analysis across multiple dimensions
4: Expert Reasoning - Theoretical frameworks and novel conceptual synthesis

Performance

The model achieves the following results on the evaluation set:

MSE: 0.2034
MAE: 0.2578
Spearman Correlation: 0.6963
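
For readers who want to compute the same three metrics on their own held-out data, the following is a minimal pure-Python sketch. The function names and the sample labels and predictions are illustrative assumptions, not the evaluation code or data behind the figures above.

```python
def mse(y_true, y_pred):
    """Mean squared error between gold scores and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between gold scores and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(y_true, y_pred):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


# Made-up example values, for illustration only.
labels = [0, 1, 2, 3, 4]
preds = [0.2, 0.9, 2.4, 2.8, 3.7]
print(f"MSE: {mse(labels, preds):.4f}")
print(f"MAE: {mae(labels, preds):.4f}")
print(f"Spearman: {spearman(labels, preds):.4f}")
```

Spearman correlation only checks whether predictions preserve the ordering of the labels, so it complements MSE and MAE when the model is used mainly for ranking or thresholding.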

Intended Uses

This model can be used to:

Filter and classify educational content by reasoning complexity
Identify complex reasoning problems across diverse domains
Serve as a first-stage filter in a reasoning dataset creation pipeline
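
As a rough illustration of the first-stage filter use case, the sketch below keeps only texts whose predicted score clears a threshold. The names `filter_by_reasoning` and `toy_score` and the cutoff of 2.5 are assumptions made for this example; in practice `score_fn` would wrap the model, for instance by returning the raw score from the pipeline helper in the Usage Examples section.

```python
from typing import Callable, Iterable, Iterator, Tuple


def filter_by_reasoning(
    texts: Iterable[str],
    score_fn: Callable[[str], float],
    min_score: float = 2.5,  # hypothetical cutoff; tune it on your own data
) -> Iterator[Tuple[str, float]]:
    """Yield (text, score) pairs whose predicted complexity is at least min_score."""
    for text in texts:
        score = score_fn(text)
        if score >= min_score:
            yield text, score


def toy_score(text: str) -> float:
    """Placeholder scorer (word count / 5, capped at 4). Not the model."""
    return min(4.0, len(text.split()) / 5)


candidates = [
    "Water boils at 100 degrees.",
    "Weighing the trade-offs between recall and precision requires "
    "considering the cost of each error type in the target setting.",
]
for text, score in filter_by_reasoning(candidates, toy_score):
    print(f"{score:.2f}  {text}")
```

Passing the scorer in as a function keeps the filtering logic independent of how the model is loaded, so the same loop works with the pipeline API or with direct model calls.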

Limitations

Predictions are influenced by the original dataset's domain distribution
Reasoning complexity is subjective and context-dependent

Training

The model was fine-tuned using a regression objective with the following settings:

Learning rate: 5e-05
Batch size: 16
Optimizer: AdamW
Schedule: Linear
Epochs: 10
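
The settings above could be expressed with the `transformers` Trainer API roughly as follows. This is a sketch under stated assumptions, not the original training script: the output directory is hypothetical, the batch size is assumed to be per device, and `adamw_torch` is assumed as the AdamW variant.

```python
# Hyperparameters from the list above, keyed by transformers.TrainingArguments names.
TRAINING_KWARGS = {
    "output_dir": "reasoning-regressor",   # hypothetical path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,     # assumption: listed batch size is per device
    "num_train_epochs": 10,
    "lr_scheduler_type": "linear",
    "optim": "adamw_torch",                # assumption: PyTorch AdamW implementation
}


def build_trainer_inputs(base_model="answerdotai/ModernBERT-base"):
    """Return (model, training_args) configured for single-output regression."""
    # Imported lazily so the settings above can be inspected without transformers installed.
    from transformers import AutoModelForSequenceClassification, TrainingArguments

    # num_labels=1 with problem_type="regression" gives a single-output head
    # trained with a mean-squared-error loss.
    model = AutoModelForSequenceClassification.from_pretrained(
        base_model, num_labels=1, problem_type="regression"
    )
    args = TrainingArguments(**TRAINING_KWARGS)
    return model, args
```

The returned objects would then be passed to a `Trainer` together with a tokenized dataset whose labels are floats in the 0-4 range.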

Usage Examples

Using the pipeline API

from transformers import pipeline
pipe = pipeline("text-classification", model="davanstrien/ModernBERT-based-Reasoning-Required")

def predict_reasoning_level(text, pipe):
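    # Get the raw prediction. function_to_apply="none" asks the pipeline to
    # return the unmodified regression output rather than a sigmoid-squashed value.
    result = pipe(text, function_to_apply="none")
    score = result[0]['score']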

    # Round to nearest integer (optional)
    rounded_score = round(score)

    # Clip to valid range (0-4)
    rounded_score = max(0, min(4, rounded_score))

    # Map the level to the scale defined above (optional)
    reasoning_labels = {
        0: "Minimal reasoning",
        1: "Basic reasoning",
        2: "Intermediate reasoning",
        3: "Advanced reasoning",
        4: "Expert reasoning"
    }

    return {
        "raw_score": score,
        "reasoning_level": rounded_score,
        "interpretation": reasoning_labels[rounded_score]
    }

# Usage
text = "This argument uses multiple sources and evaluates competing perspectives before reaching a conclusion."
result = predict_reasoning_level(text, pipe)
print(f"Raw score: {result['raw_score']:.2f}")
print(f"Reasoning level: {result['reasoning_level']}")
print(f"Interpretation: {result['interpretation']}")

Using the model directly

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer (same repository as in the pipeline example)
model_name = "davanstrien/ModernBERT-based-Reasoning-Required"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "The debate on artificial intelligence's role in society has become increasingly polarized."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# Get regression score
complexity_score = outputs.logits.item()
print(f"Reasoning Complexity: {complexity_score:.2f}/4.00")
