
Justus Tobias PRO

justus-tobias

https://justus-tobias.de

j-tobias

AI & ML interests

Multimodal Learning, Representation Learning, Audio Processing


Organizations

None yet

justus-tobias's activity

liked a model about 19 hours ago

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 1 day ago • 378k • 340

liked a model 13 days ago

THUDM/CogVideoX1.5-5B-I2V

Image-to-Video • Updated Nov 20, 2024 • 20.7k • 91

updated a Space about 1 month ago

justus-tobias/Heartbeat

💜

upvoted a paper about 2 months ago

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Paper • 2411.02256 • Published Nov 4, 2024 • 1

liked a model 2 months ago

tencent/HunyuanVideo

Text-to-Video • Updated 17 days ago • 7.23k • 1.59k

upvoted a paper 2 months ago

AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset

Paper • 2311.15308 • Published Nov 26, 2023 • 1

liked a Space 2 months ago

Gradio Demo Space creation helper V2

🐶

Generate Gradio demo files for Hugging Face model repos

updated a Space 4 months ago

Moshi

💨

Create interactive spoken dialogue using audio input

upvoted a paper 4 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 106

liked a Space 5 months ago

609

Open ASR Leaderboard

🏆

Request evaluation results for a speech model

liked a Space 6 months ago

951

Seamless M4T

📞

updated a dataset 6 months ago

justus-tobias/TestDataset

Updated Aug 15, 2024 • 2

liked a Space 6 months ago

gradio_pdf V0.10.0

🚀

Ask questions about PDF documents

liked a model 6 months ago

facebook/wav2vec2-base-960h

Automatic Speech Recognition • Updated Nov 14, 2022 • 2.36M • 315

liked 2 datasets 6 months ago

openslr/librispeech_asr

Updated Aug 14, 2024 • 14.2k • 137

MLCommons/peoples_speech

Viewer • Updated Nov 20, 2024 • 8.05M • 22.3k • 94

liked a Space 7 months ago

299

AudioLDM2 Text2Audio Text2Music Generation

🔊

Generate a video waveform from text-based audio descriptions

liked a Space 8 months ago

127

Exbert

🌍

Explore BERT model interactions

liked 2 Spaces 9 months ago

599

StoryDiffusion

👁

Generate images from text prompts and reference images

9.32k

AI Comic Factory

👩

Create your own AI comic with a single prompt