
Sayantan Das

ucalyptus · https://ucalyptus.me/

AI & ML interests

Generative Modeling

Recent Activity

liked a model 1 day ago

premai-io/Prem-Cardiology

liked a model 12 days ago

deepseek-ai/DeepSeek-R1-0528

upvoted a paper 14 days ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles



ucalyptus's activity

upvoted a paper 14 days ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published 16 days ago • 41

upvoted a collection 18 days ago

Sanskrit

collection of all Sanskrit text, currently at 115K samples • 8 items • Updated 17 days ago • 11

upvoted an article about 2 months ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • 15 authors • Published Dec 19, 2024 • 647

upvoted a paper about 2 months ago

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26 • 50

upvoted 2 articles about 2 months ago

Open R1: Update #3

Article • 10 authors • Published Mar 11 • 292

Gotchas in Tokenizer Behavior Every Developer Should Know

Article • Published Apr 18 • 37

upvoted a paper about 2 months ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 108

upvoted a collection 3 months ago

Tessa-T1 REACT REASONING MODEL

Tessa-T1 is a model that generates Stateful React with tailwind styling. It has features of other libraries as well. It is based on Qwen2.5-Coder. • 5 items • Updated Mar 24 • 8

upvoted 3 papers 3 months ago

Motion Anything: Any to Motion Generation

Paper • 2503.06955 • Published Mar 10 • 34

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 88

SynCity: Training-Free Generation of 3D Worlds

Paper • 2503.16420 • Published Mar 20 • 25

upvoted an article 3 months ago

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Article • Published Jul 5, 2024 • 257

upvoted a collection 3 months ago

SLM Judge Models

Base model(s) merged with the specific evaluation task adapter. Each model performs excellently for its purpose and remains useful for general tasks. • 6 items • Updated Feb 18 • 1

upvoted 3 articles 3 months ago

Fine-tuning SmolLM with Group Relative Policy Optimization (GRPO) by following the Methodologies

Article • Published Feb 17 • 22

🌁#90: Why AI’s Reasoning Tests Keep Failing Us

Article • Published Mar 3 • 9

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • 4 authors • Published Mar 12 • 430

upvoted a paper 3 months ago

TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Paper • 2503.04872 • Published Mar 6 • 15

upvoted a collection 3 months ago

NuExtract-1.5

4 items • Updated Nov 15, 2024 • 7

upvoted 2 articles 3 months ago

DABStep: Data Agent Benchmark for Multi-step Reasoning

Article • 6 authors • Published Feb 4 • 90

How to deploy and fine-tune DeepSeek models on AWS

Article • 3 authors • Published Jan 30 • 52