Hi! I've read the original PI paper. It says they fine-tune for only about 1000 steps to extend the context window. Did you fine-tune for the same number of steps (i.e. 1000) as the original paper? Thanks!