Decaf: Monocular Deformation Capture for Face and Hand Interactions Paper • 2309.16670 • Published Sep 28, 2023 • 5
Three Pillars improving Vision Foundation Model Distillation for Lidar Paper • 2310.17504 • Published Oct 26, 2023
OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning Paper • 2012.11552 • Published Dec 21, 2020
Localizing Objects with Self-Supervised Transformers and no Labels Paper • 2109.14279 • Published Sep 29, 2021
Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation Paper • 2203.11160 • Published Mar 21, 2022
ROAM: a Rich Object Appearance Model with Application to Rotoscoping Paper • 1612.01495 • Published Dec 5, 2016
Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery Paper • 2407.14499 • Published Jul 19, 2024
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation Paper • 2402.03119 • Published Feb 5, 2024
Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia Paper • 2410.05270 • Published Oct 7, 2024
Moshi: a speech-text foundation model for real-time dialogue Paper • 2410.00037 • Published Sep 17, 2024 • 4