ChronoGPT
A Series of Chronologically Consistent LLMs
ChronoGPT is a series of high-performance, chronologically consistent large language models (LLMs) designed to eliminate lookahead bias and training leakage while maintaining strong language understanding in time-sensitive applications. The models are pretrained on diverse, high-quality, open-source, and timestamped text to maintain chronological consistency: each checkpoint sees only data available before its knowledge-cutoff date.
All models in the series achieve HellaSwag benchmark scores that surpass those of the GPT-2 model with the same parameter count (124M). This approach preserves the integrity of historical analysis and enables more reliable economic and financial modeling.
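Because each checkpoint is frozen at a knowledge cutoff, time-sensitive workflows should load the newest vintage dated at or before the period being analyzed. Here is a minimal sketch, assuming year-end checkpoints named after their cutoff date as in the manelalab/chrono-gpt-v1-19991231 example below; checkpoint_for is a hypothetical helper, and the actual set of available vintages may differ:

from datetime import date

def checkpoint_for(analysis_date: date) -> str:
    # Pick the most recent Dec-31 cutoff at or before the analysis date,
    # so the model cannot contain information from after that point.
    y = analysis_date.year
    if analysis_date < date(y, 12, 31):
        y -= 1
    return f"manelalab/chrono-gpt-v1-{y}1231"

print(checkpoint_for(date(2000, 6, 15)))  # -> manelalab/chrono-gpt-v1-19991231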
Install the required dependencies with:
pip install -r requirements.txt
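The example below depends on torch and tiktoken (plus the modeling_chronogpt module shipped with the checkpoint), so a minimal requirements.txt would look like the following sketch; pin versions as appropriate for your environment:

torch
tiktoken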
Here is example code for loading the model and extracting logits and embeddings:
from modeling_chronogpt import ChronoGPT
import tiktoken
import torch

device = 'cuda:0'
max_length = 1792  # maximum input length in tokens

# ChronoGPT uses the GPT-2 BPE tokenizer.
tokenizer = tiktoken.get_encoding("gpt2")

# Load the vintage with a 1999-12-31 knowledge cutoff.
model = ChronoGPT.from_pretrained("manelalab/chrono-gpt-v1-19991231",
                                  trust_remote_code=True).to(device)

text = "Obviously, the time continuum has been disrupted, creating a new temporal event sequence resulting in this alternate reality. -- Dr. Brown, Back to the Future Part II"

# Tokenize, truncate to the context window, and add a batch dimension.
inputs = torch.tensor(tokenizer.encode(text))[:max_length].reshape(1, -1).to(device)

# Forward pass returns next-token logits and hidden-state embeddings.
logits, emb = model(inputs)
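From there, logits can drive next-token prediction and emb can serve as a text embedding for downstream tasks. A short sketch, assuming logits has shape (batch, seq_len, vocab_size) and emb is a (batch, seq_len, hidden_size) tensor; check modeling_chronogpt for the actual return format:

# Greedy next-token prediction from the last position's logits.
next_id = logits[0, -1].argmax(dim=-1).item()
print(tokenizer.decode([next_id]))

# Mean-pool token embeddings into a single document vector,
# e.g. as a feature for economic or financial forecasting.
doc_vector = emb.mean(dim=1)  # assumed shape: (batch, hidden_size)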
Citation:

@article{He2025ChronoBERT,
  title={Chronologically Consistent Large Language Models},
  author={He, Songrun and Lv, Linying and Manela, Asaf and Wu, Jimmy},
  journal={Working Paper},
  year={2025}
}