Why are the logits cast to float before computing softmax?
#10 opened 9 months ago
by
yuanshuai
How did you pretrain the tokenizer using tiktoken?
#9 opened 9 months ago
by
StephennFernandes

Can the model run on two GPUs of different models?
#8 opened 12 months ago
by
XCZDH
Adding Evaluation Results
#7 opened 12 months ago
by
leaderboard-pr-bot

How many English tokens was the model trained on?
3
#5 opened about 1 year ago
by
aslawliet

_set_gradient_checkpointing() got an unexpected keyword argument 'enable'
2
#3 opened about 1 year ago
by
ehartford
