Updated the Python package docs, as we now officially support passing attn_implementation. (#11) defe16c verified AshwinSankar psidharth567 committed on Jun 18
Completely overhauled the attention implementation: it now uses the existing Gemma-3 attention implementation rather than a custom monkey-patched one. (#10) 17d96ff verified AshwinSankar psidharth567 committed on Jun 18
The previous logo from the link did not render correctly. (#6) 33120e3 verified AshwinSankar psidharth567 committed on Jun 5
Added arXiv logo beside the paper name. (#5) bf49998 verified AshwinSankar psidharth567 committed on Jun 5