OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning Paper • 2306.11249 • Published Jun 20, 2023 • 2
SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning Paper • 2211.12509 • Published Nov 22, 2022
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published 11 days ago • 36
From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing Paper • 2411.11916 • Published Nov 18, 2024 • 3
AutoMix: Unveiling the Power of Mixup for Stronger Classifiers Paper • 2103.13027 • Published Mar 24, 2021 • 1
Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup Paper • 2111.15454 • Published Nov 30, 2021
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published Apr 1 • 93
Cascade-DETR: Delving into High-Quality Universal Object Detection Paper • 2307.11035 • Published Jul 20, 2023
Behavior Contrastive Learning for Unsupervised Skill Discovery Paper • 2305.04477 • Published May 8, 2023
Rethinking Memory and Communication Cost for Efficient Large Language Model Training Paper • 2310.06003 • Published Oct 9, 2023 • 2
SemiReward: A General Reward Model for Semi-supervised Learning Paper • 2310.03013 • Published Oct 4, 2023 • 2
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory Paper • 2404.11163 • Published Apr 17, 2024
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published May 30, 2024 • 32
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences Paper • 2406.08128 • Published Jun 12, 2024 • 1