Benchmarking Optimizers for Large Language Model Pretraining Paper • 2509.01440 • Published Sep 1 • 24
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated 7 days ago • 280
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 297
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 418
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 242
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published Feb 7 • 43
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published Oct 28, 2024 • 83
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 79
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 19