LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention
Paper • 2502.08213 • Published Feb 12