Was this finetuned after the architectural change? Or is this purely an inference-time code change?
Oh, that model was finetuned after the architectural change. I upgraded my architecture a bit.