pavan kumar avn

pk3388

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering

upvoted an article 16 days ago

LLM based Audio models

published a model 2 months ago

pk3388/medgemma-4b-medical-qa-finetuned

Organizations

upvoted a paper 5 days ago

CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering

Paper • 2501.01371 • Published Jan 2, 2025 • 1

upvoted an article 16 days ago

Article

LLM based Audio models

18 days ago

upvoted a paper 3 months ago

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6, 2025 • 72

upvoted 3 papers 4 months ago

Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15, 2025 • 28

HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering

Paper • 2509.09713 • Published Sep 8, 2025 • 24

Inpainting-Guided Policy Optimization for Diffusion Large Language Models

Paper • 2509.10396 • Published Sep 12, 2025 • 15

upvoted an article 4 months ago

Article

Jupyter Agents: training LLMs to reason with notebooks

Sep 10, 2025

upvoted 9 papers 4 months ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28, 2025 • 110

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28, 2025 • 89

upvoted 4 papers 5 months ago

Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off

Paper • 2508.04825 • Published Aug 6, 2025 • 58

Adapting Vision-Language Models Without Labels: A Comprehensive Survey

Paper • 2508.05547 • Published Aug 7, 2025 • 11

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Paper • 2508.09987 • Published Aug 13, 2025 • 25

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14, 2025 • 31
