# ChronoGPT-Instruct
ChronoGPT-Instruct is a family of chronologically consistent, instruction-following large language models (LLMs) that eliminate lookahead bias by training exclusively on time-stamped data available before a fixed knowledge-cutoff date τ.
Each ChronoGPT-Instruct-τ model extends the corresponding ChronoGPT-τ base model through supervised instruction fine-tuning while strictly maintaining temporal separation from all post-τ information.
These models provide the research community with a transparent, replicable benchmark for testing lookahead-bias-free prediction in economics, finance, and other time-sensitive domains.
## Model Overview
| Property | Description | 
|---|---|
| Architecture | Transformer-decoder | 
| Parameters | ≈ 1.55 B | 
| Layers | 52 | 
| Embedding dim | 1,536 | 
| Context length | 1,792 tokens | 
| Tokenizer | GPT2Tokenizer (Hugging Face) | 
| Training stage | Pretraining + Instruction Fine-tuning (SFT) | 
| License | MIT | 
| Languages | English | 
## Training & Data
### Chronological Consistency
Each model's corpus satisfies chronological consistency in both the pretraining and instruction-fine-tuning phases. Texts dated after the model's cutoff year are excluded, ensuring zero overlap with evaluation data. A GPT-4.1 classifier screens every instruction-response pair.
### Instruction-Finetuning Corpus
| Stage | Source | # Examples | Avg Length | 
|---|---|---|---|
| 1 | LLMs-from-Scratch | 1,097 | 102 | 
| 2 | GPT-3 Self-Instruct | 67,136 | 183 | 
| 3 | AllenAI Tulu-3 Mixture | 356,886 | 2,513 | 
Only English, non-code entries with pre-2000 content (classifier label = 0 & confidence = 10) are retained.
We release the SFT dataset at https://huggingface.co/datasets/manelalab/ChronoInstruct-SFT.
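As an illustration, the retention rule above can be sketched as a simple filter. The field names (`language`, `is_code`, `label`, `confidence`) are hypothetical and do not necessarily match the released dataset's schema:

```python
# Sketch of the SFT filtering rule: keep only English, non-code pairs
# whose classifier output indicates pre-cutoff content.
# Field names are illustrative, not the released dataset's schema.

def keep_example(example: dict) -> bool:
    """Return True if an instruction-response pair passes the screen."""
    return (
        example.get("language") == "en"
        and not example.get("is_code", False)
        and example.get("label") == 0        # label 0 = no post-cutoff content
        and example.get("confidence") == 10  # maximum classifier confidence
    )

examples = [
    {"language": "en", "is_code": False, "label": 0, "confidence": 10},
    {"language": "en", "is_code": False, "label": 1, "confidence": 10},  # post-cutoff
    {"language": "de", "is_code": False, "label": 0, "confidence": 10},  # non-English
]
retained = [ex for ex in examples if keep_example(ex)]
print(len(retained))  # 1
```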
## Usage Examples
You can try ChronoGPT-Instruct directly in your browser via Google Colab:
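For local inference, a minimal sketch with the Hugging Face Transformers library is shown below. The checkpoint id is an assumption based on the naming convention above (check the manelalab hub page for the exact model names), and the generation settings are illustrative:

```python
MODEL_ID = "manelalab/ChronoGPT-Instruct"  # hypothetical checkpoint id

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for an instruction prompt with ChronoGPT-Instruct."""
    # Imported lazily so nothing is downloaded until generation is requested.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    # The context length is 1,792 tokens, so keep prompt + output under that.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Summarize the efficient markets hypothesis."))
```

Because the corpus ends at the cutoff date τ, prompts about post-cutoff events test the model's reasoning rather than its memorized knowledge.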
## Citation
```bibtex
@article{He_Lv_Manela_Wu_chronogpt_2025,
  title   = {Chronologically Consistent Generative AI},
  author  = {He, Songrun and Lv, Linying and Manela, Asaf and Wu, Jimmy},
  journal = {Working Paper},
  year    = {2025}
}
```
