OxxoCodes's Collections
Distilled Long-Context Encoders

Updated Aug 30, 2024

Encoder-style architectures with efficient attention, distilled into student models with half as many hidden layers, plus a long-context NER dataset.
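
To make "half the hidden layers" concrete, here is a minimal sketch of how a student's layers could be mapped onto a teacher's. The collection does not say how the student layers were actually chosen. The scheme below (keep every second teacher layer, as in DistilBERT-style initialisation) is an assumption, and so are the helper name `student_layer_map` and the 12-layer teacher depth.

```python
def student_layer_map(num_teacher_layers: int, keep_every: int = 2) -> list[int]:
    """Return the teacher layer indices a student could be initialised from.

    Hypothetical helper: keeps every `keep_every`-th teacher layer,
    starting at layer 0. This is an assumed scheme, not the one
    documented for the models in this collection.
    """
    if num_teacher_layers <= 0 or keep_every <= 0:
        raise ValueError("layer counts must be positive")
    return list(range(0, num_teacher_layers, keep_every))


# Assumption: a base-sized teacher encoder with 12 hidden layers.
teacher_layers = 12
kept = student_layer_map(teacher_layers)

# The student ends up with half as many layers as the teacher.
print(len(kept), kept)  # prints: 6 [0, 2, 4, 6, 8, 10]
```

With `keep_every=2` the student has exactly half the teacher's depth; other selections (for example, the first or last six layers) would satisfy the same description.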


  • giant-oak/lsg-roberta-base-4096

    Fill-Mask • Updated Dec 27, 2023 • 9

  • giant-oak/distil-lsg-roberta-base-4096

    Fill-Mask • Updated Jul 10, 2023 • 8

  • giant-oak/distil-nystromformer-4096

    Fill-Mask • Updated Jun 21, 2023 • 17

  • giant-oak/distil-longformer-base-4096

    Updated Jul 8, 2023 • 6.54k

  • giant-oak/distil-bigbird-roberta-base

    Updated Jun 21, 2023

  • giant-oak/GONERD

    Viewer • Updated Dec 6, 2023 • 2.23k • 12 • 1

  • Efficient Transformer Knowledge Distillation: A Performance Review

    Paper • 2311.13657 • Published Nov 22, 2023 • 1