tim-lawson's Collections

Learning to Skip the Middle Layers of Transformers

Transformers with a novel gating mechanism that skips layers from the middle outward: https://arxiv.org/pdf/2506.21103
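
To make "from the middle outward" concrete, here is a minimal sketch of the skipping pattern only. It is not the paper's implementation: the learned gating mechanism is replaced by a hypothetical integer `num_skipped`, and the function names `middle_out_skip_mask` and `forward_with_middle_skip` are invented for illustration. The assumption is that the skipped layers always form one contiguous span centred on the middle of the stack, so a larger value widens the span toward the first and last layers.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def middle_out_skip_mask(num_layers: int, num_skipped: int) -> List[bool]:
    """Return one flag per layer: True means run the layer, False means skip it.

    The skipped layers form a contiguous span centred on the middle of the
    stack (hypothetical rule, chosen for illustration).
    """
    if not 0 <= num_skipped <= num_layers:
        raise ValueError("num_skipped must be between 0 and num_layers")
    start = (num_layers - num_skipped) // 2  # first skipped layer index
    end = start + num_skipped                # one past the last skipped index
    return [not (start <= i < end) for i in range(num_layers)]


def forward_with_middle_skip(
    x: T, layers: Sequence[Callable[[T], T]], num_skipped: int
) -> T:
    """Apply the layers in order, bypassing the central span."""
    mask = middle_out_skip_mask(len(layers), num_skipped)
    for layer, run in zip(layers, mask):
        if run:
            x = layer(x)  # skipped layers leave x unchanged
    return x


if __name__ == "__main__":
    # Six layers, two skipped: the two central layers are bypassed.
    print(middle_out_skip_mask(6, 2))
```

In a real model the amount of skipping would come from a gate computed from the input rather than a fixed integer; the sketch only shows which layers would be bypassed for a given span width.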