Alejandro Hernández Cano's picture

3 9 6

Alejandro Hernández Cano

alehc

·

AI & ML interests

None yet

Organizations

alehc's activity

upvoted 6 papers 2 months ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 50

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 291

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 391

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 229

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Paper • 2502.05003 • Published Feb 7 • 44

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 186

upvoted a paper 6 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 83

upvoted 2 papers 10 months ago

Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models

Paper • 2407.12327 • Published Jul 17, 2024 • 80

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

Paper • 2311.16079 • Published Nov 27, 2023 • 19