This classifier is fine-tuned from deberta-v3-base on this forecastability classification dataset to predict whether Claude 3.7 Sonnet considers a fineweb document 'forecastable', i.e. a useful seed for generating pastcasting questions.

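The classifier has a ROC AUC of 0.9625, but only ~2% of fineweb documents are considered forecastable. Because positives are that rare, even a small false-positive rate produces many false positives relative to true positives, so the classifier's precision/recall curves on random unseen fineweb documents look like this: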

[Figure: precision/recall curves for the classifier on random unseen fineweb documents]
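The effect of the ~2% base rate on precision can be illustrated with a little arithmetic. The sketch below is not taken from the model's evaluation: the function name `precision_at` and the true-positive/false-positive rates in the loop are hypothetical operating points chosen only for illustration. Only the ~2% prevalence comes from the text above.

```python
def precision_at(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by a true-positive rate, a false-positive rate,
    and the fraction of positives in the population (Bayes' rule)."""
    true_pos = tpr * prevalence            # positives correctly flagged
    false_pos = fpr * (1.0 - prevalence)   # negatives incorrectly flagged
    flagged = true_pos + false_pos
    if flagged == 0.0:
        raise ValueError("nothing is flagged at this operating point")
    return true_pos / flagged


PREVALENCE = 0.02  # ~2% of fineweb documents are forecastable (from the text)

# Hypothetical (TPR, FPR) operating points, NOT measured values for this model.
for tpr, fpr in [(0.9, 0.10), (0.8, 0.05), (0.5, 0.01)]:
    p = precision_at(tpr, fpr, PREVALENCE)
    print(f"TPR={tpr:.2f} FPR={fpr:.2f} -> precision={p:.3f}")
```

With the same hypothetical rates and a balanced dataset (prevalence 0.5), the implied precision would be far higher, which is why ROC AUC alone says little about precision on raw fineweb.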

To load the model, use:

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('noanabeshima/forecastability-classifier-v1')
tokenizer = AutoTokenizer.from_pretrained('noanabeshima/forecastability-classifier-v1')
Model size: 184M params, stored as F32 tensors in Safetensors format.