Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated 4 days ago • 39
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 62
MS MARCO Mined Triplets Collection These datasets contain MS MARCO Triplets gathered by mining hard negatives using various models. Each dataset has various subsets. • 14 items • Updated May 21 • 7
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 185
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 64