Shivaen Ramshetty

shivr
  • sramshetty

AI & ML interests

NLP, CV, Multimodal

Recent Activity

upvoted a paper about 1 month ago
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
upvoted a paper 12 months ago
Tokenization Falling Short: The Curse of Tokenization
upvoted a paper 12 months ago
TroL: Traversal of Layers for Large Language and Vision Models

Organizations

  • fastai X Hugging Face Group 2022
  • Aurora-M/MDEL

shivr's activity

commented on 4 papers about 1 year ago

The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 81 • 14

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 65 • 21

New activity in shivr/gpt2-xl_local-narratives-reduced-overlap_lora over 1 year ago

Librarian Bot: Add base_model information to model

#1 opened over 1 year ago by librarian-bot
New activity in shivr/gpt2-xl_grit_and_local-narratives_lora over 1 year ago

Librarian Bot: Add base_model information to model

#1 opened over 1 year ago by librarian-bot