Tune-A-Video-library (Tune a video concepts library)

ginipick

posted an update 5 days ago

🎨 FLUX VIDEO Generation - All-in-One AI Image/Video/Audio Generator

🚀 Introduction
FLUX VIDEO Generation is an all-in-one AI creative tool that generates images, videos, and audio from text prompts, powered by NVIDIA H100 GPU for lightning-fast processing!

ginigen/Flux-VIDEO

✨ Key Features
1️⃣ Text → Image → Video 🖼️➡️🎬

Generate high-quality images from Korean/English prompts
Transform still images into natural motion videos
Multiple size presets (Instagram, YouTube, Facebook, etc.)
Demo: 1-4 seconds / Full version: up to 60 seconds

2️⃣ Image Aspect Ratio Change 🎭

Freely adjust image aspect ratios
Expand images with outpainting technology
5 alignment options (Center, Left, Right, Top, Bottom)
Real-time preview functionality

3️⃣ Video + Audio Generation 🎵

Add AI-generated audio to videos
Korean prompt support (auto-translation)
Context-aware sound generation
Powered by MMAudio technology

🛠️ Tech Stack

Image Generation: FLUX, Stable Diffusion XL
Video Generation: TeaCache optimization
Audio Generation: MMAudio (44kHz high-quality)
Outpainting: ControlNet Union
Infrastructure: NVIDIA H100 GPU for ultra-fast generation

💡 How to Use

Select your desired tab
Enter your prompt (Korean/English supported!)
Adjust settings
Click generate button

🎯 Use Cases

📱 Social media content creation
🎥 YouTube Shorts/Reels
📊 Presentation materials
🎨 Creative artwork
🎵 Background sound generation

prithivMLmods

posted an update 6 days ago

OpenAI, Google, Hugging Face, and Anthropic have released guides and courses on building agents, prompting techniques, scaling AI use cases, and more. Below are 10+ minimalistic guides and courses that may help you along the way. 📖

⤷ Agents Companion : https://www.kaggle.com/whitepaper-agent-companion
⤷ Building Effective Agents : https://www.anthropic.com/engineering/building-effective-agents
⤷ Guide to building agents by OpenAI : https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf
⤷ Prompt engineering by Google : https://www.kaggle.com/whitepaper-prompt-engineering
⤷ Google: 601 real-world gen AI use cases : https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders
⤷ Prompt engineering by IBM : https://www.ibm.com/think/topics/prompt-engineering-guide
⤷ Prompt Engineering by Anthropic : https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview
⤷ Scaling AI use cases : https://cdn.openai.com/business-guides-and-resources/identifying-and-scaling-ai-use-cases.pdf
⤷ Prompting Guide 101 : https://services.google.com/fh/files/misc/gemini-for-google-workspace-prompting-guide-101.pdf
⤷ AI in the Enterprise by OpenAI : https://cdn.openai.com/business-guides-and-resources/ai-in-the-enterprise.pdf

by HF🤗 :
⤷ AI Agents Course by Huggingface : https://huggingface.co/learn/agents-course/unit0/introduction
⤷ Smol-agents Docs : https://huggingface.co/docs/smolagents/en/tutorials/building_good_agents
⤷ MCP Course by Huggingface : https://huggingface.co/learn/mcp-course/unit0/introduction
⤷ Other Courses (LLM, Computer Vision, Deep RL, Audio, Diffusion, Cookbooks, etc.) : https://huggingface.co/learn

ginipick

posted an update 6 days ago

🎨 AI Hairstyle Changer - Transform with 93 Styles! 💇‍♀️✨

🚀 Introduction
Experience 93 different hairstyles and 29 hair colors in real-time with your uploaded photo!
Transform your look instantly with this AI-powered Gradio web app.

✨ Key Features

📸 Simple 3 Steps
Upload Photo - Upload a front-facing photo
Select Style - Choose from 93 hairstyles
Pick Color - Click your desired color from 29 color palette options

💫 Diverse Hairstyles (93 types)

🎯 Short Cuts: Pixie Cut, Bob, Lob, Crew Cut, Undercut
🌊 Waves: Soft Waves, Hollywood Waves, Finger Waves
🎀 Braids: French Braid, Box Braids, Fishtail Braid, Cornrows
👑 Updos: Chignon, Messy Bun, Top Knot, French Twist
🌈 Special Styles: Space Buns, Dreadlocks, Mohawk, Beehive

🎨 Hair Color Palette (29 colors)

🤎 Natural Colors: Black, Browns, Blonde variations
❤️ Red Tones: Red, Auburn, Copper, Burgundy
💜 Fashion Colors: Blue, Purple, Pink, Green, Rose Gold
⚪ Cool Tones: Silver, Ash Blonde, Titanium

🌟 Key Advantages

⚡ Fast Processing: Get results in just 10-30 seconds
🎯 High Accuracy: Natural-looking transformations with AI technology
💎 Professional Quality: High-resolution output suitable for social media
🔄 Unlimited Trials: Try as many combinations as you want
📱 User-Friendly: Intuitive interface with visual color palette

💡 Perfect For

💈 Salon Consultations: Show clients potential new looks before cutting
🛍️ Personal Styling: Experiment before making a big change
🎭 Entertainment: Fun transformations for social media content
🎬 Creative Projects: Character design and visualization
👗 Fashion Industry: Match hairstyles with outfits and makeup
📸 Photography: Pre-visualization for photoshoots

LINK: ginipick/Change-Hair

prithivMLmods

posted an update 7 days ago

Just made a demo for Cosmos-Reason1, a physical AI model that understands physical common sense and generates appropriate embodied decisions in natural language through long chain-of-thought reasoning. Also added video understanding support to it. 🤗🚀

✦ Try the demo here : prithivMLmods/DocScope-R1

⤷ Cosmos-Reason1-7B : nvidia/Cosmos-Reason1-7B
⤷ docscopeOCR-7B-050425-exp : prithivMLmods/docscopeOCR-7B-050425-exp
⤷ Captioner-Relaxed : Ertugrul/Qwen2.5-VL-7B-Captioner-Relaxed

⤷ Multimodal Implementations : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0

⤷ GitHub :
• https://github.com/PRITHIVSAKTHIUR/Cosmos-x-DocScope
• https://github.com/PRITHIVSAKTHIUR/Nvidia-Cosmos-Reason1-Demo

To learn more, visit the model card of the respective model.

AtAndDev

posted an update 8 days ago

deepseek-ai/DeepSeek-R1-0528

This is the end

1024m

authored a paper 11 days ago

Uncovering Cultural Representation Disparities in Vision-Language Models

Paper • 2505.14729 • Published 17 days ago • 1

prithivMLmods

posted an update 16 days ago

Got access to Google's all-new Gemini Diffusion, a state-of-the-art text diffusion model. It delivers the performance of Gemini 2.0 Flash-Lite at 5x the speed, generating over 1000 tokens in a fraction of a second and producing impressive results. Below are some initial outputs generated using the model. ♊🔥

Gemini Diffusion Playground ✦ : https://deepmind.google.com/frontiers/gemini-diffusion

Get Access Here : https://docs.google.com/forms/d/1aLm6J13tAkq4v4qwGR3z35W2qWy7mHiiA0wGEpecooo/viewform?edit_requested=true

🔗 To know more, visit: https://deepmind.google/models/gemini-diffusion/

prithivMLmods

posted an update 17 days ago

More optimized explicit content filters: lightweight guard models based on siglip2 patch16 512 and vit patch16 224, trained for illustration and explicit content classification. They target content moderation in social media, forums, and parental controls for safer browsing environments. This version fixes the issues in the previous release, which lacked sufficient resources. 🚀

⤷ Models :
→ siglip2 mini explicit content : prithivMLmods/siglip2-mini-explicit-content [recommended]
→ vit mini explicit content : prithivMLmods/vit-mini-explicit-content

⤷ Building image safety-guard models : strangerguardhf

⤷ Datasets :
→ nsfw multidomain classification : strangerguardhf/NSFW-MultiDomain-Classification
→ nsfw multidomain classification v2.0 : strangerguardhf/NSFW-MultiDomain-Classification-v2.0

⤷ Collection :
→ Updated Versions [05192025] : prithivMLmods/explicit-content-filters-682aaa4733e378561925ca2b
→ Previous Versions : prithivMLmods/siglip2-content-filters-042025-final-680fe4aa1a9d589bf2c915ff

Find the collections inside the collection.👆

To learn more, visit the model card of the respective model.

ginipick

posted an update 17 days ago

AI BOOK MAKER 📚✨
Transform your text and PDF into a beautiful AI-powered intelligent Flipbook with magic 🪄

ginipick/AI-BOOK

Introduction 🌟
AI BOOK MAKER is a revolutionary platform that converts text and PDF files into intelligent AI books. With just a single file upload, our automatic RAG (Retrieval-Augmented Generation) system activates an AI chatbot that perfectly comprehends your content, delivering a next-generation digital book experience that combines interactive flipbooks with conversational intelligence! 📖✨

Groundbreaking Core Features 💎

One-Click RAG System 🔄: Automatic knowledge base creation and AI conversation engine activation with just one text or PDF upload
Industry-Leading Flip Effects 📄➡️📄: Exclusive AI-driven page transition technology for an immersive experience beyond physical books
Perfect Cross-Platform Support 📱: Intelligent responsive design providing optimized experiences on any device
Automatic Unique URL Generation 🔗: Exclusive system creating personalized links for instant sharing with friends, family, and colleagues
AI Auto-Summary Engine 🤖: Intelligent summarization and insight extraction features that instantly grasp the essence of your content
Ultra-Intelligent AI Chatbot 💬: Personalized knowledge assistant to ask questions and get answers about book content

Game-Changer For People Who 👍

📝 Authors and creators wanting to share their knowledge and content as AI-powered interactive books
🎓 Educators and students looking to transform research materials and learning content into smart, conversational flipbooks
👨‍💼 Professionals seeking to upgrade business documents into intelligent books shareable with clients and team members
📚 Anyone wanting to share valuable documents with their network while exploring new experiences with AI assistance

Start the Magic in 3 Seconds 🛠️

Single Upload 📤
Ultra-Fast AI Conversion ⚡
Custom URL Acquisition 👀
Explore with AI 💬

ginipick

posted an update 20 days ago

🌟 Introducing Ilúvatar: Creative Design & Invention AI 🌟

Link: ginipick/IDEA-DESIGN

Hello, AI creators! 👋
Today I'm introducing Ilúvatar, an amazing tool that automatically generates innovative design and invention ideas.

✨ Key Features

🧠 AI-Powered Idea Generation: Creates detailed design/invention ideas from simple prompts
🔍 Web Search Integration: Incorporates real-time information to reflect latest trends
📊 Kaggle Dataset Analysis: Provides data-driven insights
🖼️ Automatic Image Generation: Creates image prompts visualizing your ideas
📁 File Upload Support: Analyzes reference materials (text, CSV, PDF)
📈 Business Frameworks: Includes SWOT, Porter's 5 Forces, BCG Matrix analyses
🌏 Multilingual Support: Available in both English and Korean

🎯 Perfect For

💼 Product Designers/Developers: When you need fresh product concepts
🔬 Researchers/Inventors: When you need innovative idea inspiration
📝 Planners/Marketers: When you need differentiated business strategies
🎓 Students/Educators: For creative thinking and problem-solving education

🚀 Start Creating Now!
Utilizing 24 categories and ~1,100 items as design SEEDS, the system generates combinations across 2-6 depth levels, creating up to 1,100 trillion design variables. A "water-air transitional device" might combine structural self-reorganization, material transformation, biomimetic movement, and propulsion optimization.
The LLM analyzes correlations between user queries and design combinations, identifying innovative elements like hybrid propulsion systems inspired by nature.
By integrating data from Kaggle datasets, web searches, and research, the system prioritizes groundbreaking combinations such as "graphene morphing wings + AI fluid dynamics + quantum dot solar cells" with feasibility assessments.
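
The post does not say how the "1,100 trillion" figure is derived. As a rough sketch only, the snippet below counts combinations under two assumptions that are not from the post: the ~1,100 items are split evenly across the 24 categories (46 per category, a hypothetical value), and a combination of depth d picks d distinct categories and one item from each.

```python
from math import comb

# Assumptions (not from the post): items are split evenly across categories,
# and a combination picks `depth` distinct categories, then one item from each.
CATEGORIES = 24          # stated in the post
ITEMS_PER_CATEGORY = 46  # hypothetical even split of ~1,100 items (24 * 46 = 1,104)


def combinations_at_depth(depth: int) -> int:
    """Count ways to pick `depth` categories and one item from each."""
    return comb(CATEGORIES, depth) * ITEMS_PER_CATEGORY ** depth


def total_combinations(min_depth: int = 2, max_depth: int = 6) -> int:
    """Sum the counts over all depth levels in the stated 2-6 range."""
    return sum(combinations_at_depth(d) for d in range(min_depth, max_depth + 1))


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, combinations_at_depth(d))
    print("total", total_combinations())
```

Under these assumptions the total is on the order of 10^15, the same order of magnitude as the figure quoted above, and depth 6 contributes almost all of it. A different counting rule would give a different number.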

prithivMLmods

posted an update 21 days ago

These models detect images generated by diffusion models (Flux.1, SDXL, etc.). They are fine-tuned from image classification models for content moderation, using datasets available on the Hub. For identifying AI-generated images or moderating visual content, the recommended model is OpenSDI-Flux.1-SigLIP2.😺🧨

Models :
prithivMLmods/OpenSDI-Flux.1-SigLIP2 [Best approach for AI [Diffusion Generated] vs. real image classification]
prithivMLmods/OpenSDI-SD2.1-SigLIP2
prithivMLmods/OpenSDI-SD3-SigLIP2
prithivMLmods/OpenSDI-SD1.5-SigLIP2
prithivMLmods/OpenSDI-SDXL-SigLIP2

Datasets :
nebula/OpenSDI_test
madebyollin/megalith-10m

Collection : prithivMLmods/opensdi-diffusion-generated-image-classification-682488a3a3e5be7083db3383

Find the collections inside the collection.👆

To learn more, visit the model card of the respective model.

prithivMLmods

posted an update 22 days ago

Dropping some image classification models for content moderation and classifiers trained with datasets available on the Hub. All are fine-tuned on the siglip2 backbone, using the AIOrNot competition, Imagenette, and Driver-Drowsiness datasets. Models and datasets are listed below:

🤗Models :
AI or Not : prithivMLmods/AIorNot-SigLIP2
Driver Drowsiness Detection : prithivMLmods/DOZE-GUARD-RLDD
Subset 10 ImageNet : prithivMLmods/IMAGENETTE

🥊Datasets :
+ competitions/aiornot
+ akahana/Driver-Drowsiness-Dataset
+ frgfm/imagenette

🔗Collection :
[The previous collection of models is also listed in the same collection, so you can find more models focused on image classification tasks.]

- prithivMLmods/multiclass-image-classification-05142025-68234c8010a9350a4d6739b5

Find the collections inside the collection.🤪👆

To learn more, visit the model card of the respective model.

ginipick

posted an update 25 days ago

# 🌟 3D Model to Video: Easy GLB Conversion Tool 🌟

demo link: ginigen/3D-VIDEO

Hello there! Would you like to transform your 3D models into stunning animations? This space can help you! ✨

## 🔍 What Can It Do?

This tool converts your uploaded GLB model into:
1. 🎮 A transformed GLB file
2. 🎬 An animated GIF preview
3. 📋 A metadata JSON file

## ✅ Key Features

* 🖥️ Works in headless server environments (EGL + pyglet-headless → pyrender fallback)
* 🔍 Objects in GIFs appear 3x larger (global scale ×3)
* 🎨 Clean interface with pastel background

## 🎮 Animation Types

* 🔄 Rotate - Object rotates around the Y-axis
* ⬆️ Float - Object moves smoothly up and down
* 💥 Explode - Object moves sideways
* 🧩 Assemble - Object returns to its original position
* 💓 Pulse - Object changes in size
* 🔄 Swing - Object swings around the Z-axis

## 🛠️ How to Use

1. Upload your GLB model 📤
2. Select your desired animation type 🎬
3. Adjust the duration and FPS ⏱️
4. Click the "Generate Animation" button ▶️
5. Download your results 📥

## 💻 Technical Details

* Rendering system using trimesh and pyrender
* Automatic fallback method for rendering failures to ensure stability
* GIF generation supporting up to 60 frames
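
As an illustration of how duration, FPS, and the 60-frame cap could interact for the Rotate animation, here is a minimal sketch. It is not the Space's actual code: the function names are hypothetical, and it assumes one full turn around the Y-axis spread evenly over the frames.

```python
import math

MAX_FRAMES = 60  # the post states that GIF generation supports up to 60 frames


def rotation_angles(duration_s, fps, max_frames=MAX_FRAMES):
    """Return one Y-axis angle (radians) per frame, assuming a single full turn."""
    # Frame count is duration * fps, clamped to the range [1, max_frames].
    n = max(1, min(max_frames, round(duration_s * fps)))
    return [2 * math.pi * i / n for i in range(n)]


def rotate_y(point, angle):
    """Rotate an (x, y, z) point around the Y-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


if __name__ == "__main__":
    angles = rotation_angles(duration_s=3, fps=30)  # 90 requested frames, capped
    print(len(angles))
    print(rotate_y((1.0, 0.0, 0.0), angles[1]))
```

In a real pipeline, each angle would be applied to the mesh before rendering that frame. The cap keeps the frame count bounded for long durations or high FPS values.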

Breathe life into your static 3D models with this tool! 🚀 If you have any questions or feedback, please let us know. Happy 3D modeling! ✨

prithivMLmods

posted an update 26 days ago

Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets—along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:

🤗Models & Datasets :

Realistic Gender Classification : prithivMLmods/Realistic-Gender-Classification
⎙ prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection : prithivMLmods/Document-Type-Detection
⎙ prithivMLmods/Document-Type-Detection
Face Mask Detection : prithivMLmods/Face-Mask-Detection
⎙ DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier : prithivMLmods/Alzheimer-Stage-Classifier
⎙ SilpaCS/Augmented_alzheimer
Bone Fracture Detection : prithivMLmods/Bone-Fracture-Detection
⎙ Hemg/bone-fracture-detection
GiD Land Cover Classification : prithivMLmods/GiD-Land-Cover-Classification
⎙ jonathan-roberts1/GID

🤗Collection : prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c

To know more about it, visit the model card of the respective model.

Nymbo

posted an update 27 days ago

Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

prithivMLmods

posted an update 30 days ago

Well, here’s the updated version with the 20,000+ entry sampled dataset for Watermark Filter Content Moderation models incl. [Food25, Weather, Watermark, Marathi/Hindi Sign Language Detection], post-trained from the base models: sigLip2 patch16 224 — now with mixed aspect ratios for better performance and reduced misclassification. 🔥

Models :
➮ Watermark-Detection : prithivMLmods/Watermark-Detection-SigLIP2
⌨︎ Watermark Detection & Batch Image Processing Experimentals, Colab Notebook : https://colab.research.google.com/drive/1mlQrSsSjkGimUt0VyRi3SoWMv8OMyvw3?usp=drive_link
➮ Weather-Image-Classification : prithivMLmods/Weather-Image-Classification
➮ TurkishFoods-25 : prithivMLmods/TurkishFoods-25
➮ Marathi-Sign-Language-Detection : prithivMLmods/Marathi-Sign-Language-Detection
➮ Hindi-Sign-Language-Detection : prithivMLmods/Hindi-Sign-Language-Detection

Datasets :
Watermark : qwertyforce/scenery_watermarks
Weather : prithivMLmods/WeatherNet-05-18039
Turkish Foods 25 : yunusserhat/TurkishFoods-25
Marathi Sign Language : VinayHajare/Marathi-Sign-Language
Hindi Sign Language : Vedant3907/Hindi-Sign-Language-Dataset

Collection : prithivMLmods/content-filters-siglip2-vit-68197e3357d4de18fb3b4d2b

ginipick

posted an update about 1 month ago

🔮 Mistral Perflexity AI - Local LLM Space with Web Search Capabilities 🌐
Hello AI enthusiasts! Today I'm excited to introduce my special Hugging Face space! 🚀

ginigen/Mistral-Perflexity

✨ Key Features

Powerful Model: Using Private-BitSix-Mistral-Small-3.1-24B-Instruct-2503, optimized through 6-bit quantization to run smoothly on local 4090 GPUs! 💪
Web Search Integration: Leveraging the Brave Search API to provide real-time web search results for user queries! 🔍
Customizable Responses: Shape AI personality and response format through system messages ⚙️
Multilingual Support: Perfect handling of both English and Korean! 🇺🇸🇰🇷

🛠️ Technical Highlights

GGUF Format: Optimized quantized model with excellent memory efficiency
Flash Attention: Applied optimization technology for faster inference speeds
8K Context Window: Capable of handling lengthy conversations and complex queries
Streaming Responses: Watch text being generated in real-time

💡 Use Cases

Complex Q&A requiring real-time information
Programming assistance and code generation
Multilingual content creation and translation
Summarization and explanation of learning materials

🔧 Customization
Adjust various parameters like Temperature, Top-p, Top-k, and repetition penalty to control response creativity and accuracy. Lower temperature (0.1-0.5) produces more deterministic responses, while higher values (0.7-1.0) generate more creative outputs!
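
To make the effect of these parameters concrete, here is a small, self-contained sketch of how temperature, top-k, and top-p are commonly applied to a model's logits before sampling. It is a generic illustration, not the Space's implementation: the function name is hypothetical and repetition penalty is left out.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    ranked = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, index in ranked:
        kept.append((prob, index))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for prob, _ in kept)
    return {index: prob / norm for prob, index in kept}


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]
    print(sampling_distribution(logits, temperature=0.2))
    print(sampling_distribution(logits, temperature=1.0))
    print(sampling_distribution(logits, top_k=2))
```

With a low temperature, nearly all probability mass lands on the highest-scoring token, which is why the output looks more deterministic. Higher temperatures spread the mass out and yield more varied text.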

🌟 Try It Yourself!
This space is available for anyone to use for free. Experience the power of a robust local LLM combined with web search capabilities! Your feedback is always welcome! 😊

prithivMLmods

posted an update about 1 month ago

The new versions of the Midjourney Mix adapters have been dropped in stranger zone hf. These adapters excel in studio lighting portraits and painterly styles, trained using the style of strangerzonehf/Flux-Midjourney-Mix2-LoRA. They leverage 24-bit colored synthetic images generated from Midjourney v6 to achieve high-quality image reproducibility and support adaptable aspect ratios, using Flux.1 as the base model. 🥳

Models [ ⌗ ]

> Flux-Midjourney-Painterly-LoRA : strangerzonehf/Flux-Midjourney-Painterly-LoRA
> Flux-Midjourney-Studio-LoRA : strangerzonehf/Flux-Midjourney-Studio-LoRA

> Collection : strangerzonehf/midjourney-mix-3-ft-flux1-dev-68165d58a2a08025852d63f3

> Space : prithivMLmods/FLUX-LoRA-DLC2

The best dimensions and inference settings are as follows: a resolution of 1280 x 832 (approximately a 3:2 aspect ratio) is recommended for the best quality, while 1024 x 1024 (1:1) serves as the default option. For inference, 30 to 35 steps are recommended.

ginipick

posted an update about 1 month ago

🎨 Renoir Studio: Impressionist Masterpieces Reborn Through AI ✨

🌟 Experience Renoir's Magical Brushstrokes with AI!

🔗 Try it now: ginigen/flux-lora-renoir
🔗 Model page: openfree/pierre-auguste-renoir
🔗 Collection: openfree/painting-art-ai-681453484ec15ef5978bbeb1

Hello, AI art enthusiasts! 💖
Today I'm introducing a special model - Pierre-Auguste Renoir Studio. Create your own beautiful artwork in the style of the 19th-century French Impressionist master! 🖼️

✨ Why Renoir's Style?
Renoir is famous for his luminous colors and soft brushstrokes. His works feature:

🌞 Warm sunshine and dancing light
👨‍👩‍👧‍👦 The beauty of everyday life and joyful moments
🌸 Vibrant nature and portraits of beautiful women
🎭 Lively Parisian social gatherings and outdoor scenes

🔬 Technical Features
This model was developed as a flux-based learning model trained on a curated collection of high-resolution masterpieces from renowned global artists. The LoRA fine-tuning process leveraged exceptional quality open-access imagery released by prestigious institutions including the Art Institute of Chicago. The resulting model demonstrates remarkable capability in capturing the nuanced artistic techniques and stylistic elements across diverse historical art movements! 🧠💫

🚀 How to Use

Describe your desired scene in the prompt box
Add the "renoir" keyword at the end (this is the trigger keyword!)
Click the 'Generate' button
Enjoy your ideas reborn in Renoir's style!
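
If you build prompts programmatically, the trigger-keyword step above can be handled with a small helper. This is only a sketch; the function name is hypothetical and not part of the Space or the model.

```python
TRIGGER = "renoir"  # trigger keyword named in the post


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Append the trigger keyword to the end of the prompt unless it is already there."""
    prompt = prompt.strip()
    if prompt.lower().endswith(trigger.lower()):
        return prompt
    return f"{prompt} {trigger}"


if __name__ == "__main__":
    print(with_trigger("Paris cafe terrace, people chatting over coffee, evening sunset"))
```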

💡 Recommended Prompt Examples

"Elegant ladies enjoying a picnic in a sunlit garden, wearing pastel dresses and hats renoir"
"People boating by a riverbank, light reflecting on water, warmth of summer renoir"
"Paris cafe terrace, people chatting over coffee, evening sunset renoir"

🌈 Now It's Your Turn!
#AI #Renoir #ArtificialIntelligence #HuggingFace #FLUX #LoRA

Nymbo

posted an update about 1 month ago

PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies ever since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.

Team members 85
