ChronoGPT

Model Description

ChronoGPT is a series of high-performance, chronologically consistent large language models (LLMs) designed to eliminate lookahead bias and training leakage while maintaining strong language understanding in time-sensitive applications. The models are pretrained on diverse, high-quality, open-source, and timestamped text to maintain chronological consistency.

All models in the series achieve HellaSwag benchmark scores that surpass those of the GPT-2 model with the same parameter count (124M). This approach preserves the integrity of historical analysis and enables more reliable economic and financial modeling.

  • Developed by: Songrun He, Linying Lv, Asaf Manela, Jimmy Wu
  • Model type: Transformer-based autoregressive decoder (Modified modded-NanoGPT architecture)
  • Language(s) (NLP): English
  • License: MIT License

Model Sources

  • Paper: "Chronologically Consistent Large Language Models" (He, Lv, Manela, Wu, 2025)

How to Get Started with the Model

Install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Here is example code for using the model:

from modeling_chronogpt import ChronoGPT
import tiktoken
import torch

device = 'cuda:0'
max_length = 1792

# ChronoGPT uses the GPT-2 BPE tokenizer.
tokenizer = tiktoken.get_encoding("gpt2")
model = ChronoGPT.from_pretrained("manelalab/chrono-gpt-v1-19991231", trust_remote_code=True).to(device)

text = "Obviously, the time continuum has been disrupted, creating a new temporal event sequence resulting in this alternate reality. -- Dr. Brown, Back to the Future Part II"

# Encode, truncate to max_length tokens, and add a batch dimension.
inputs = torch.tensor(tokenizer.encode(text))[:max_length].reshape(1, -1).to(device)
# The forward pass returns next-token logits and token embeddings.
logits, emb = model(inputs)
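
As a follow-up, the returned logits can be read out for greedy next-token prediction. This is only a minimal sketch: it assumes logits has shape [batch, seq_len, vocab_size]; the returned emb can similarly serve as a text representation for downstream tasks.

# Hedged sketch: greedy next-token readout, assuming logits has shape
# [batch, seq_len, vocab_size].
next_token_id = logits[0, -1].argmax(dim=-1).item()
print(tokenizer.decode([next_token_id]))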

Training Details

Training Data

  • Pretraining corpus: Our initial model, chrono-gpt-v1-19991231, is pretrained on 21 billion tokens of diverse, high-quality, open-source text written before 2000, ensuring no leakage of later data.
  • Incremental updates: Yearly updates from 2000 to 2024 add a further 65 billion tokens of timestamped text, one vintage per year end (see the sketch after this list for pairing observation dates with vintages).
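
Because each vintage is trained only on text available up to its date stamp, a backtest can pair every observation with the latest checkpoint whose cutoff precedes it. The snippet below is a minimal sketch that assumes one checkpoint per year end following the manelalab/chrono-gpt-v1-YYYYMMDD naming pattern of the two checkpoints named in this card; the helper vintage_for is hypothetical.

from datetime import date

def vintage_for(obs_date: date) -> str:
    # Latest year-end cutoff strictly before the observation date,
    # clamped to the 1999-2024 vintages (assumed naming pattern).
    cutoff_year = max(1999, min(obs_date.year - 1, 2024))
    return f"manelalab/chrono-gpt-v1-{cutoff_year}1231"

print(vintage_for(date(2008, 3, 15)))  # -> manelalab/chrono-gpt-v1-20071231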

Training Procedure

  • Architecture: Modified modded-NanoGPT with the Muon optimizer, skip connections, rotary embeddings, and FlexAttention (an illustrative rotary-embedding sketch follows below).
  • Objective: Autoregressive text generation.
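
For illustration only, the snippet below sketches rotary position embeddings in the half-split form used by several open GPT variants; it is not the actual modded-NanoGPT implementation, and the function name rope is hypothetical.

import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    # x: [seq_len, head_dim] with even head_dim. Each channel pair is rotated
    # by an angle that grows with position, encoding order without learned
    # position embeddings.
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, :half], x[:, half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)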

Evaluation

Testing Data, Factors & Metrics

  • Language understanding: Evaluated on HellaSwag benchmark tasks (a scoring sketch follows after this list).
  • Financial forecasting: Evaluated on a return prediction task based on Dow Jones Newswire data.
  • Comparison models: ChronoGPT was benchmarked against BERT, FinBERT, StoriesLM-v1-1963, and Llama 3.1.
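
As a hedged sketch of how HellaSwag-style scoring is commonly done with an autoregressive LM (not necessarily the exact harness used here), each candidate ending can be scored by its length-normalized log-likelihood given the context, and the highest-scoring ending chosen. The helper ending_logprob is hypothetical and assumes the (logits, emb) interface shown above.

import torch
import torch.nn.functional as F

def ending_logprob(model, tokenizer, context: str, ending: str, device: str) -> float:
    # Length-normalized log-likelihood of `ending` given `context`.
    ctx_ids = tokenizer.encode(context)
    end_ids = tokenizer.encode(" " + ending)
    ids = torch.tensor(ctx_ids + end_ids, device=device).unsqueeze(0)
    with torch.no_grad():
        logits, _ = model(ids)
    logprobs = F.log_softmax(logits[0, :-1], dim=-1)
    # Rows starting at len(ctx_ids) - 1 predict the ending tokens.
    start = len(ctx_ids) - 1
    targets = torch.tensor(end_ids, device=device).unsqueeze(1)
    picked = logprobs[start:start + len(end_ids)].gather(1, targets)
    return picked.mean().item()

# predicted = max(range(len(endings)),
#                 key=lambda i: ending_logprob(model, tokenizer, ctx, endings[i], device))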

Results

  • HellaSwag score: chrono-gpt-v1-19991231 and chrono-gpt-v1-20241231 achieved HellaSwag scores of 0.295 and 0.324, respectively, outperforming GPT-2 (0.294).
  • Stock return predictions: Over the sample period from 2008-01 to 2023-07, chrono-gpt-v1-realtime achieves a long-short portfolio Sharpe ratio of 4.50, outperforming BERT, FinBERT, and StoriesLM-v1-1963, and comparable to that of Llama 3.1 8B (4.90); a generic long-short construction is sketched after this list.
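
For context on the reported Sharpe ratios, the snippet below is a generic sketch of an equal-weighted, long-short decile portfolio Sharpe ratio computed from per-period model scores and realized returns; it is not the paper's exact portfolio construction, and long_short_sharpe is a hypothetical helper.

import numpy as np

def long_short_sharpe(scores: np.ndarray, returns: np.ndarray,
                      top_frac: float = 0.1, periods_per_year: int = 252) -> float:
    # scores, returns: [n_periods, n_assets]. Each period, go long the top
    # decile of assets by score and short the bottom decile, equal-weighted.
    k = max(1, int(top_frac * scores.shape[1]))
    order = np.argsort(scores, axis=1)
    longs = np.take_along_axis(returns, order[:, -k:], axis=1).mean(axis=1)
    shorts = np.take_along_axis(returns, order[:, :k], axis=1).mean(axis=1)
    spread = longs - shorts
    return float(np.sqrt(periods_per_year) * spread.mean() / spread.std(ddof=1))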

Citation

@article{He2025ChronoBERT,
  title={Chronologically Consistent Large Language Models},
  author={He, Songrun and Lv, Linying and Manela, Asaf and Wu, Jimmy},
  journal={Working Paper},
  year={2025}
}

Model Card Authors

Songrun He, Linying Lv, Asaf Manela, and Jimmy Wu
