Raja Biswas

rbiswasfc

AI & ML interests

NLP, Generative AI

Recent Activity

updated a model 1 day ago

rbiswasfc/models

published a model 2 days ago

rbiswasfc/models

updated a dataset 6 days ago

rbiswasfc/zotero-answer-ai-texts

rbiswasfc's activity

upvoted 2 articles 7 days ago

Article

Vision Language Models (Better, Faster, Stronger)

By (lead author not shown) and 4 others • 26 days ago • 417

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

By (lead author not shown) and 6 others • 17 days ago • 140

upvoted an article 10 days ago

Article

The case for specialized pre-training: ultra-fast foundation models for dedicated tasks

Aug 4, 2024 • 30

upvoted a paper 14 days ago

Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval

Paper • 2505.16967 • Published 15 days ago • 22

upvoted a collection about 1 month ago

Text to SVG papers

7 items • Updated Apr 30, 2024 • 7

upvoted 3 papers about 1 month ago

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30 • 56

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29 • 55

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29 • 94

upvoted a paper about 2 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 92

upvoted an article about 2 months ago

Article

DABStep: Data Agent Benchmark for Multi-step Reasoning

By (lead author not shown) and 5 others • Feb 4 • 89

upvoted a collection 2 months ago

SigLIP2

36 items • Updated 8 days ago • 74

upvoted 4 papers 2 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 115

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 190

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 128

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26 • 156

upvoted 2 collections 2 months ago

RLVR

Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated Mar 31 • 11

ReSearch

Trained models as described in the paper "ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning" • 5 items • Updated Mar 27 • 6

upvoted a paper 3 months ago

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 165

upvoted a collection 3 months ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 238

upvoted an article 3 months ago

Article

The N Implementation Details of RLHF with PPO

By (lead author not shown) and 2 others • Oct 24, 2023 • 58