nilq's Collections
Dynamics of Transformer Language Model Features

updated Oct 17, 2024

  • Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

    Paper • 2203.05482 • Published Mar 10, 2022 • 6

  • Diverse Weight Averaging for Out-of-Distribution Generalization

    Paper • 2205.09739 • Published May 19, 2022 • 1

  • Fusing finetuned models for better pretraining

    Paper • 2204.03044 • Published Apr 6, 2022 • 6

  • Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

    Paper • 2309.07311 • Published Sep 13, 2023 • 4

  • Steering Llama 2 via Contrastive Activation Addition

    Paper • 2312.06681 • Published Dec 9, 2023 • 15

  • Knowledge Fusion of Large Language Models

    Paper • 2401.10491 • Published Jan 19, 2024 • 5

  • ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models

    Paper • 2402.00794 • Published Feb 1, 2024 • 1

  • Resolving Interference When Merging Models

    Paper • 2306.01708 • Published Jun 2, 2023 • 14

  • Tracking Universal Features Through Fine-Tuning and Model Merging

    Paper • 2410.12391 • Published Oct 16, 2024 • 5