Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models Paper • 2506.07177 • Published Jun 8 • 22
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization Paper • 2202.11453 • Published Feb 23, 2022
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization Paper • 2504.08641 • Published Apr 11 • 7
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning Paper • 2506.03525 • Published Jun 4 • 6
EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance Paper • 2505.21876 • Published May 28 • 9
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs Paper • 2503.01820 • Published Mar 3 • 2
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20 • 46
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Paper • 2411.16657 • Published Nov 25, 2024 • 20
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement Paper • 2411.15115 • Published Nov 22, 2024 • 9
Adapt-∞: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection Paper • 2410.10636 • Published Oct 14, 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation Paper • 2410.12761 • Published Oct 16, 2024
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models Paper • 2304.01515 • Published Apr 4, 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models Paper • 2310.00754 • Published Oct 1, 2023
On the Soft-Subnetwork for Few-shot Class Incremental Learning Paper • 2209.07529 • Published Sep 15, 2022 • 1
Forget-free Continual Learning with Soft-Winning SubNetworks Paper • 2303.14962 • Published Mar 27, 2023 • 1
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences Paper • 2401.10529 • Published Jan 19, 2024 • 1
EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents Paper • 2403.12014 • Published Mar 18, 2024
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models Paper • 2310.02998 • Published Oct 4, 2023 • 1