howard-hou's Collections

RWKV-X

RWKV-X is a family of long-context models based on RWKV-7, enhanced with Sparse Attention and capable of handling context windows up to 64K tokens.