Daily Papers

by AK and the research community

Dec 5

Magicoder: Source Code Is All You Need · 5 authors · Submitted by akhaliq

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models · 3 authors · Submitted by akhaliq

FaceStudio: Put Your Face Everywhere in Seconds · 6 authors · Submitted by akhaliq

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning · 8 authors · Submitted by akhaliq

DeepCache: Accelerating Diffusion Models for Free · 3 authors · Submitted by akhaliq

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence · 10 authors · Submitted by akhaliq

Segment and Caption Anything · 8 authors · Submitted by akhaliq

LivePhoto: Real Image Animation with Text-guided Motion Control · 7 authors · Submitted by akhaliq

Nash Learning from Human Feedback · 17 authors · Submitted by akhaliq

Describing Differences in Image Sets with Natural Language · 8 authors · Submitted by akhaliq

DiffiT: Diffusion Vision Transformers for Image Generation · 5 authors · Submitted by akhaliq

Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models · 5 authors · Submitted by akhaliq

GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis · 7 authors · Submitted by akhaliq

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models · 11 authors · Submitted by akhaliq

Object Recognition as Next Token Prediction · 6 authors · Submitted by akhaliq

Fine-grained Controllable Video Generation via Object Appearance and Context · 7 authors · Submitted by akhaliq

GIVT: Generative Infinite-Vocabulary Transformers · 3 authors · Submitted by akhaliq

StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D · 10 authors · Submitted by akhaliq

Training Chain-of-Thought via Latent-Variable Inference · 10 authors · Submitted by akhaliq

Fast View Synthesis of Casual Videos · 7 authors · Submitted by akhaliq

Segment Any 3D Gaussians · 7 authors · Submitted by akhaliq

Style Aligned Image Generation via Shared Attention · 4 authors · Submitted by akhaliq

Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models · 6 authors · Submitted by akhaliq

GPT4Point: A Unified Framework for Point-Language Understanding and Generation · 8 authors · Submitted by akhaliq

Axiomatic Preference Modeling for Longform Question Answering · 5 authors · Submitted by akhaliq

WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words · 7 authors · Submitted by akhaliq

VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams · 8 authors · Submitted by akhaliq

SANeRF-HQ: Segment Anything for NeRF in High Quality · 4 authors · Submitted by akhaliq

Generative Powers of Ten · 9 authors · Submitted by akhaliq

Rejuvenating image-GPT as Strong Visual Representation Learners · 6 authors · Submitted by akhaliq

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training · 9 authors · Submitted by akhaliq

Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments · 16 authors · Submitted by akhaliq

TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents · 6 authors · Submitted by akhaliq