Why is the model size in Evo2 set to 40.3 B parameters? Is it because of the 9.3 T tokens of training data?
#2 opened 3 days ago by RevengeUSA
Trouble Loading Evo2 40B Model on 2x A100 GPUs
#1 opened 2 months ago by RuiHu