Two months ago, we benchmarked @google’s Veo2 model. It fell short, struggling with style consistency and temporal coherence, trailing behind Runway, Pika, @tencent, and even @alibaba-pai.
That’s changed.
We just wrapped up benchmarking Veo3, and the improvements are substantial. It outperformed every other model by a wide margin across all key metrics. It isn't just better; it dominates on style, coherence, and prompt adherence. It's rare to see such a clear lead in today’s hyper-competitive T2V landscape.
We just added HiDream-I1 to our T2I leaderboard (https://www.rapidata.ai/leaderboard/image-models), benchmarked using 195k+ human responses from 38k+ annotators, all collected in under 24 hours.
We just published a dataset using a new (for us) preference modality: direct ranking based on aesthetic preference. We ranked a couple of thousand images from most to least preferred, all sampled from the Open Image Preferences v1 dataset by the amazing @data-is-better-together team.
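If you'd rather consume the ranking as pairwise comparisons (e.g. for reward-model training), here is a minimal sketch of expanding a most-to-least-preferred list into pairwise preference records. The field names ("chosen", "rejected", "rank_gap") are illustrative assumptions, not the dataset's actual schema.

```python
from itertools import combinations

def ranking_to_pairs(ranked_image_ids):
    """Expand a most-to-least-preferred ranking into pairwise preference records.

    ranked_image_ids: list of image identifiers, index 0 = most preferred.
    """
    pairs = []
    for (rank_w, winner), (rank_l, loser) in combinations(enumerate(ranked_image_ids), 2):
        pairs.append({
            "chosen": winner,             # higher-ranked image
            "rejected": loser,            # lower-ranked image
            "rank_gap": rank_l - rank_w,  # distance between the two in the ranking
        })
    return pairs

# A ranking of four images yields 6 pairwise preferences.
print(ranking_to_pairs(["img_a", "img_b", "img_c", "img_d"]))
```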
🔥 Yesterday was a fire day! We dropped two brand-new datasets capturing human preferences for text-to-video and text-to-image generations, powered by our own crowdsourcing tool!
Whether you're working on model evaluation, alignment, or fine-tuning, this is for you.
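Getting started is a one-liner with the Hugging Face `datasets` library; the repo id below is a placeholder, so substitute the actual dataset name from our profile.

```python
from datasets import load_dataset

# Placeholder repo id; replace with the real dataset name from our Hugging Face profile.
ds = load_dataset("Rapidata/your-preference-dataset", split="train")

# Inspect the columns before wiring the data into an evaluation or fine-tuning pipeline.
print(ds.column_names)
print(ds[0])
```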
🚀 Rapidata: Setting the Standard for Model Evaluation
Rapidata is proud to announce our first independent appearance in academic research, featured in the Lumina-Image 2.0 paper. This marks the beginning of our journey to become the standard for testing text-to-image and generative models. Our expertise in large-scale human annotations allows researchers to refine their models with accurate, real-world feedback.
As we continue to establish ourselves as a key player in model evaluation, we’re here to support researchers with high-quality annotations at scale. Reach out to [email protected] to see how we can help.