---
library_name: pytorch
license: mit
language:
- en
tags:
- chronologically consistent
- modded-nanogpt
- hellaswag
pipeline_tag: text-generation
inference: false
---
# ChronoGPT

## Model Description

ChronoGPT is a series of high-performance, chronologically consistent large language models (LLMs) designed to eliminate lookahead bias and training leakage while maintaining good language understanding in time-sensitive applications. Each model is pretrained on diverse, high-quality, open-source, and timestamped text to maintain chronological consistency.

Despite having the same parameter count as the GPT-2 124M model, all models in the series achieve HellaSwag benchmark scores that surpass it. This approach preserves the integrity of historical analysis and enables more reliable economic and financial modeling.
- Developed by: Songrun He, Linying Lv, Asaf Manela, Jimmy Wu
- Model type: Transformer-based autoregressive decoder (Modified modded-NanoGPT architecture)
- Language(s) (NLP): English
- License: MIT License
## Model Sources
- Paper: "Chronologically Consistent Large Language Models" (He, Lv, Manela, Wu, 2025)
## How to Get Started with the Model

Install the required dependencies:

```
pip install -r requirements.txt
```

Here is example code for using the model:
```python
from modeling_chronogpt import ChronoGPT
import tiktoken
import torch

device = 'cuda:0'
max_length = 1792

# ChronoGPT uses the GPT-2 BPE tokenizer.
tokenizer = tiktoken.get_encoding("gpt2")
model = ChronoGPT.from_pretrained("manelalab/chrono-gpt-v1-19991231", trust_remote_code=True).to(device)

text = "Obviously, the time continuum has been disrupted, creating a new temporal event sequence resulting in this alternate reality. -- Dr. Brown, Back to the Future Part II"

# Encode, truncate to the model's context length, and add a batch dimension.
inputs = torch.tensor(tokenizer.encode(text))[:max_length].reshape(1, -1).to(device)

# The forward pass returns next-token logits and hidden-state embeddings.
logits, emb = model(inputs)
```
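As a minimal follow-on sketch (assuming `logits` has shape `(batch, seq_len, vocab_size)` over the GPT-2 vocabulary), a greedy next-token prediction can be read off the last position:

```python
# Greedy next-token prediction from the last position (shape assumption noted above).
next_token_id = logits[0, -1].argmax(dim=-1).item()
print(tokenizer.decode([next_token_id]))
```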
## Training Details

### Training Data

- Pretraining corpus: Our initial model, chrono-gpt-v1-19991231, is pretrained on 21 billion tokens of pre-2000, diverse, high-quality, and open-source text data, ensuring no leakage of text published after that date (a sketch of this cutoff-based filtering follows the list).
- Incremental updates: Yearly updates from 2000 to 2024 with an additional 65 billion tokens of timestamped text.
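The chronological guarantee amounts to filtering every training document by its timestamp against a model vintage's cutoff date. The sketch below only illustrates the idea; the `documents` structure and field names are hypothetical, not the actual data pipeline.

```python
from datetime import date

# Hypothetical corpus: each document carries a publication timestamp.
documents = [
    {"text": "An article published before 2000.", "published": date(1998, 5, 14)},
    {"text": "An article published after 2000.", "published": date(2003, 9, 2)},
]

def corpus_for_vintage(docs, cutoff):
    """Keep only documents published on or before the vintage's cutoff date."""
    return [d["text"] for d in docs if d["published"] <= cutoff]

# chrono-gpt-v1-19991231 sees only text available by 1999-12-31.
pretraining_corpus = corpus_for_vintage(documents, date(1999, 12, 31))
```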
### Training Procedure

- Architecture: modded-nanogpt-based model with the Muon optimizer, skip connections, rotary embeddings, and FlexAttention (a minimal rotary-embedding sketch follows this list).
- Objective: Autoregressive text generation.
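As an illustration of one listed component, below is a minimal rotary position embedding (RoPE) sketch in PyTorch. It follows the standard formulation and is not claimed to match the modded-nanogpt implementation detail for detail.

```python
import torch

def apply_rotary_embedding(x, base=10000.0):
    """Standard RoPE sketch for x of shape (batch, heads, seq_len, head_dim).

    Rotates pairs of channels by position-dependent angles so that attention
    scores depend on relative positions.
    """
    _, _, seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-torch.arange(half, dtype=x.dtype, device=x.device) / half)
    angles = torch.arange(seq_len, dtype=x.dtype, device=x.device)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()  # each of shape (seq_len, half)
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```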
## Evaluation

### Testing Data, Factors & Metrics

- Language understanding: Evaluated on the HellaSwag benchmark (a scoring sketch follows this list).
- Financial forecasting: Evaluated using a return prediction task based on Dow Jones Newswire data.
- Comparison models: ChronoGPT was benchmarked against BERT, FinBERT, StoriesLM-v1-1963, and Llama 3.1.
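For reference, HellaSwag-style evaluation scores each candidate ending by its log-likelihood under the model and picks the highest-scoring one. The sketch below assumes the forward signature from the usage example above (`logits, emb = model(inputs)`) and uses total log-probability; the exact scoring procedure (e.g., length normalization) may differ.

```python
import torch
import torch.nn.functional as F

def score_ending(model, tokenizer, context, ending, device='cuda:0'):
    """Sum of log-probabilities of the ending tokens given the context.

    Assumes model(ids) returns (logits, emb) with logits of shape
    (1, seq_len, vocab_size); illustrative, not the exact evaluation code.
    """
    ctx_ids = tokenizer.encode(context)
    end_ids = tokenizer.encode(" " + ending)
    ids = torch.tensor(ctx_ids + end_ids, device=device).unsqueeze(0)
    with torch.no_grad():
        logits, _ = model(ids)
    log_probs = F.log_softmax(logits[0, :-1], dim=-1)  # predictions for tokens 1..N-1
    targets = ids[0, 1:]
    start = len(ctx_ids) - 1  # index of the prediction for the first ending token
    ending_scores = log_probs[start:].gather(1, targets[start:, None])
    return ending_scores.sum().item()

# The predicted ending is the candidate with the highest score among the options.
```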
### Results

- HellaSwag score: chrono-gpt-v1-19991231 and chrono-gpt-v1-20241231 achieve HellaSwag scores of 0.295 and 0.324, respectively, outperforming GPT-2 (0.294).
- Stock return predictions: Over the sample period from 2008-01 to 2023-07, chrono-gpt-v1-realtime achieves a long-short portfolio Sharpe ratio of 4.50, outperforming BERT, FinBERT, and StoriesLM-v1-1963, and comparable to Llama 3.1 8B (4.90); a generic Sharpe-ratio sketch follows this list.
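For clarity on the reported metric: the annualized Sharpe ratio of a self-financing long-short portfolio is its mean return divided by its return volatility, scaled by the square root of the number of periods per year. A generic sketch, not the paper's exact procedure:

```python
import numpy as np

def annualized_sharpe(period_returns, periods_per_year=252):
    """Annualized Sharpe ratio of a self-financing long-short return series."""
    r = np.asarray(period_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)
```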
## Citation

```bibtex
@article{He2025ChronoBERT,
  title={Chronologically Consistent Large Language Models},
  author={He, Songrun and Lv, Linying and Manela, Asaf and Wu, Jimmy},
  journal={Working Paper},
  year={2025}
}
```
## Model Card Authors
- Songrun He (Washington University in St. Louis, [email protected])
- Linying Lv (Washington University in St. Louis, [email protected])
- Asaf Manela (Washington University in St. Louis, [email protected])
- Jimmy Wu (Washington University in St. Louis, [email protected])