Yash Marathe

yashmarathe

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

deca-ai/3-alpha-ultra

liked a model 3 days ago

meituan-longcat/LongCat-Flash-Chat

liked a model 4 days ago

pinecone/ConstBERT

upvoted 3 collections about 2 months ago

SuperBPE

SuperBPE tokenizers and models trained with them • 8 items • Updated Apr 10 • 15

💧 LFM2

LFM2 is a new generation of hybrid models, designed for on-device deployment. • 15 items • Updated 6 days ago • 94

Hybrid Linear Attention Research

All 1.3B & 340M hybrid linear-attention experiments. • 60 items • Updated Jul 7 • 12

upvoted 3 collections 3 months ago

Avey 1 Research Preview

1.5B preview models trained on 100B tokens of FineWeb, and an instruct-tuned version on smoltalk. • 3 items • Updated Jun 16 • 6

V-JEPA 2

A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 161

Falcon-H1

Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated Jul 31 • 53

upvoted 3 collections 4 months ago

LipSync and Face Operations

22 items • Updated 9 days ago • 56

Perception LM

7 items • Updated Apr 17 • 61

Perception Encoder

17 items • Updated Jul 11 • 67

upvoted 3 collections 5 months ago

Skywork-OR1

Skywork Open Reasoner 1 • 11 items • Updated May 29 • 31

Kimina Prover Preview

State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28 • 33

Kimi-VL-A3B

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Jul 1 • 75

upvoted an article 6 months ago

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

Article • May 7, 2024 • 96

upvoted 2 collections 6 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated about 18 hours ago • 297

L1

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • 7 items • Updated Jul 13 • 7

upvoted a collection 7 months ago

SYNTHETIC-1

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Jul 14 • 63

upvoted an article 7 months ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Article • Feb 4 • 172

upvoted 2 papers 8 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 284

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 100

upvoted a collection 9 months ago

Toy Models to Study

9 items • Updated Mar 17, 2024 • 2