Andrew (mendeza)
0 followers · 3 following
http://andrewmendez.me
AndrewMendez19
interactivetech
AI & ML interests: Computer Vision, Deep Learning
Recent Activity
upvoted a paper 16 days ago: Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency
reacted to akhaliq's post with ❤️ 20 days ago:
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (https://huggingface.co/papers/2404.07143). This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component of the proposed approach is a new attention technique dubbed Infini-attention. Infini-attention incorporates a compressive memory into the vanilla attention mechanism, and builds both masked local attention and long-term linear attention into a single Transformer block. The authors demonstrate the effectiveness of the approach on long-context language modeling benchmarks, a 1M-token passkey retrieval task, and a 500K-token book summarization task with 1B and 8B LLMs. The approach introduces minimal bounded memory parameters and enables fast streaming inference for LLMs.
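The core idea in that abstract (local masked attention within a segment, plus a compressive memory that accumulates past segments' keys and values for linear-attention retrieval) can be sketched in a few lines of NumPy. Everything below is an illustrative assumption, not the paper's code: the function names are made up, the ELU+1 feature map follows standard linear-attention practice, and a fixed 0.5 mix stands in for the paper's learned gate.

```python
import numpy as np

def elu_plus_one(x):
    # Feature map sigma(x) = ELU(x) + 1, commonly used for linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def infini_attention_segment(q, k, v, M, z):
    """Process one segment of a long stream.

    q, k, v: (seg_len, d) arrays for this segment.
    M: (d, d) compressive memory accumulated over previous segments.
    z: (d,) running normalizer for the memory.
    Returns the segment output and the updated (M, z).
    """
    sq, sk = elu_plus_one(q), elu_plus_one(k)

    # Long-term retrieval: linear attention against the compressive memory
    mem_out = (sq @ M) / (sq @ z + 1e-6)[:, None]

    # Local causal (masked) softmax attention within the segment
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores[np.triu(np.ones(scores.shape, dtype=bool), k=1)] = -np.inf
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    local_out = w @ v

    # Fold this segment's keys/values into the memory for future segments
    M = M + sk.T @ v
    z = z + sk.sum(axis=0)

    # Combine local and long-term outputs (the paper learns this gate)
    return 0.5 * local_out + 0.5 * mem_out, M, z
```

Because the memory is a fixed (d, d) matrix updated per segment, cost stays bounded no matter how many segments stream through, which is what enables the 1M-token contexts the post mentions.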
upvoted an article 23 days ago: Tiny Agents: a MCP-powered agent in 50 lines of code
Models (3)
mendeza/opt-125m-synthetic-finetuned3 · Text Generation · 0.1B params · Updated Feb 14 · 42 downloads
mendeza/opt-125m-synthetic-finetuned · Text Generation · 0.1B params · Updated Feb 13 · 8 downloads
mendeza/Llama-3.2-3B-Instruct · Text Generation · 3B params · Updated Feb 12 · 10 downloads
Datasets (0)
None public yet