🎨 FLUX VIDEO Generation - All-in-One AI Image/Video/Audio Generator
🚀 Introduction
FLUX VIDEO Generation is an all-in-one AI creative tool that generates images, videos, and audio from text prompts, powered by an NVIDIA H100 GPU for lightning-fast processing!
- Generate high-quality images from Korean/English prompts
- Transform still images into natural motion videos
- Multiple size presets (Instagram, YouTube, Facebook, etc.)
- Demo: 1-4 seconds / Full version: up to 60 seconds
Hey! I built RAG MCP Server Space, a simple Gradio MCP server for RAG systems that allows you to search relevant results without passing huge contexts to your LLM.
You can use this space to integrate with your agents and improve the efficiency of your search results. Feel free to try it out and let me know if you have any feedback or questions!
OpenAI, Google, Hugging Face, and Anthropic have released guides and courses on building agents, prompting techniques, scaling AI use cases, and more. Below are 10+ minimalistic guides and courses that may help you in your progress. 📖
🧠 We just implemented Andrej Karpathy's "third paradigm" for LLM learning!
System Prompt Learning (SPL) enables LLMs to automatically learn problem-solving strategies from experience, rather than relying on static prompts.
🚀 How it works: Your LLM builds a database of effective strategies, selects the best ones for each problem, and refines them over time based on success rates.
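As a rough illustration of the loop described above (a hypothetical sketch, not optillm's actual implementation), the strategy database could be as simple as records with running success rates, selected greedily per problem type:

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    """A human-readable problem-solving strategy with a running success rate."""
    text: str
    attempts: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        # Untried strategies get a neutral prior so they still get selected.
        return self.successes / self.attempts if self.attempts else 0.5

class StrategyDB:
    """Hypothetical store: pick the best strategies per problem type, refine over time."""
    def __init__(self):
        self.by_type: dict[str, list[Strategy]] = {}

    def add(self, problem_type: str, text: str) -> Strategy:
        s = Strategy(text)
        self.by_type.setdefault(problem_type, []).append(s)
        return s

    def select(self, problem_type: str, k: int = 3) -> list[Strategy]:
        # Choose the top-k strategies by observed success rate.
        pool = self.by_type.get(problem_type, [])
        return sorted(pool, key=lambda s: s.success_rate, reverse=True)[:k]

    def record(self, strategy: Strategy, solved: bool) -> None:
        # Refinement signal: update the success rate after each attempt.
        strategy.attempts += 1
        strategy.successes += int(solved)

db = StrategyDB()
a = db.add("math", "Break the problem into sub-goals and verify each step.")
b = db.add("math", "Translate the problem into an equation first.")
db.record(a, True)
db.record(b, False)
best = db.select("math", k=1)[0]
```

Because every strategy is stored as plain text, the whole database stays human-readable and inspectable.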
The best part? All strategies are human-readable and the system gets progressively better at problem types you use frequently.
✨ Key benefits:
🔄 Cumulative learning over time
📖 Transparent, inspectable strategies
🔌 Works with any OpenAI-compatible API
⚡ Simple integration: just add the "spl-" prefix to your model name
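The "spl-" prefix presumably acts as proxy-side routing: strip the prefix, enable the plugin, and forward the request to the underlying model. A minimal sketch of that idea (hypothetical helper, not optillm's code):

```python
def route_model(model: str) -> tuple[str, bool]:
    """Return (underlying_model, spl_enabled) for an incoming model name.

    Hypothetical: proxies like optillm commonly encode plugin activation
    in a model-name prefix, so clients need no other changes.
    """
    prefix = "spl-"
    if model.startswith(prefix):
        return model[len(prefix):], True
    return model, False

# e.g. "spl-gpt-4o-mini" -> forward to "gpt-4o-mini" with SPL enabled
```

From the client's side, nothing changes except the `model` string sent to the OpenAI-compatible endpoint.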
Built as an open-source plugin in optillm. After 500 queries, our system developed 129 strategies and refined 97 of them!
This feels like a genuine step toward AI that learns from experience while staying completely interpretable.
1. Agentset MCP -> https://github.com/agentset-ai/mcp-server For quickly building intelligent, doc-based apps using the open-source Agentset platform for RAG
2. GitHub MCP Server -> https://github.com/github/github-mcp-server Integrates GitHub APIs into your workflow, allowing you to build AI tools and apps that interact with GitHub's ecosystem
5. Safe Local Python Executor -> https://github.com/maxim-saplin/mcp_safe_local_python_executor A lightweight tool for running LLM-generated Python code locally, using Hugging Face’s LocalPythonExecutor (from smolagents framework) and exposing it via MCP for AI assistant integration
7. Basic Memory -> https://memory.basicmachines.co/docs/introduction This knowledge management system connects to LLMs and lets you build a persistent semantic graph from conversations with AI agents
🎯 Core Features
- 15 Expert Theories for professional brand naming
- Bilingual Support: Korean/English for global brands
- Unified Evaluation System: creativity/memorability/relevance scores
- Real-time Visualization: theory-specific custom designs
🔬 Applied Theories
Cognitive Theories (4)
🟦 Square Theory - Semantic square structure with 4-word relationships
🔊 Sound Symbolism - Psychological connections between phonemes and meaning
🧠 Cognitive Load - Minimized processing for instant recognition
👁️ Gestalt Theory - Perceptual principles where the whole exceeds its parts
Creative Theories (3)
🔀 Conceptual Blending - Merging concepts to create new meanings
🔧 SCAMPER Method - 7 creative transformation techniques
🌿 Biomimicry - Nature-inspired wisdom from 3.8 billion years of evolution
Cultural Theories (3)
🎭 Jung's Archetypes - 12 universal archetypes for emotional connection
🌐 Linguistic Relativity - Consideration of cross-cultural thinking patterns
🧬 Memetics - Cultural transmission and evolutionary potential
Differentiation Theories (3)
⚡ Von Restorff Effect - Uniqueness for 30x better recall
🎨 Color Psychology - Emotional associations and color meanings
🌍 Network Effects - Value maximization through network structures
💫 Special Features Each theory provides unique visualizations and customized analysis:
Square Theory → 4-corner relationship diagram
Conceptual Blending → concept fusion flowchart
Color Psychology → interactive color palette display
Theory-specific insights for each approach
🎵 Dream come true for content creators! TIGER AI can extract voice, effects & music from ANY audio file 🤯 This lightweight model uses frequency band-split technology to separate speech like magic. Kudos to @fffiloni for the amazing demo! fffiloni/TIGER-audio-extraction
It's just become easier to share your apps on the biggest AI app store (aka HF spaces) for unlimited storage, more visibility and community interactions.
Just pick a React, Svelte, or Vue template when you create your Space, or add app_build_command: npm run build and app_file: build/index.html to your README's YAML block.
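For example, the README front matter of such a Space could look like this (the two build keys come from the post; the other metadata values are illustrative):

```yaml
---
title: My React App            # illustrative metadata
sdk: static                    # static Spaces serve prebuilt files
app_build_command: npm run build
app_file: build/index.html
---
```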
Just completed the AI Agents course and wow, that capstone project really makes you understand how to build agents that can handle real-world complexity!
The final project uses the GAIA dataset - your agent has to solve tasks like analyzing Excel files, processing audio recordings, answering questions about YouTube videos, and diving into research papers. These aren't toy examples - this is the messy, multimodal stuff agents need to handle in practice.
Whether you’re just getting started with agents or want to go deeper with tools like LangChain, LlamaIndex, and SmolAgents, this course has tons of useful stuff. A few key insights:
- Code agents are incredibly versatile once you get the architecture right
- The sweet spot is finding the right balance of guidance vs. autonomy for each use case
- Once the logic clicks, the possibilities really are endless - it's like letting LLMs break free from the chatbox
The course is free and the certification deadline is July 1st, 2025.
Playing with Veo3 this morning. Share your prompt if you want me to create videos for you (bonus points if they funnily reference HF/open-source). These videos were generated from the prompt "a cat on the moon rapping 'I love Hugging Face'"!
🚀 For those interested in a minimalistic integration of LLM inference with predefined reasoning schemas, I'm excited to share the latest bulk-chain 1.1.0. It's a no-strings-attached solution for deploying your LLM for efficient inference over data iterators.
✨ Key Features:
- Full async inference support, including streaming mode for real-time output
- Simplified inference API
🔗 Check out the repo: https://github.com/nicolay-r/bulk-chain
💡 Special thanks to @RicardoLee for his work on efficient async LLaMA-3 deployment that helped shape this release: https://github.com/RicardoLeeV587/Llama3-FastInference
🚀 ZeroGPU medium size is now available as a power-user feature
Nothing too fancy for now (ZeroGPU Spaces still default to large, 70GB VRAM), but this paves the way for:
- 💰 size-based quotas / pricing (medium will offer significantly more usage than large)
- 🦣 the upcoming xlarge size (141GB VRAM)
As of now, you can control the GPU size via a Space variable. Accepted values:
- auto (future default)
- medium
- large (current default)
The auto mode checks the total CUDA tensor size during startup:
- More than 30GB → large
- Otherwise → medium
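The rule above can be sketched in a few lines (the 30GB threshold is from the post; the byte-counting comment is an assumption about how a Space might measure its loaded weights):

```python
GIB = 1024 ** 3  # one gibibyte in bytes

def pick_gpu_size(total_cuda_tensor_bytes: int) -> str:
    """Mirror the auto rule: more than 30GB of CUDA tensors -> large, else medium."""
    return "large" if total_cuda_tensor_bytes > 30 * GIB else "medium"

# In practice the total could be gathered from loaded model weights,
# e.g. summing p.numel() * p.element_size() over all CUDA parameters.
```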
📄 arXiv paper: In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer (2504.20690)
🔥 Why it’s cool: - Achieves high-quality, multi-task image editing. - Uses only 1% of the training parameters and 0.1% of the training data compared to existing methods — extremely efficient - Beats several commercial models on background preservation, ID control, and consistency - Open-source, low-cost, faster, and stronger — think of it as the “DeepSeek of image editing” 👀
We also implemented a Gradio demo app, available directly in our GitHub repo! And we made a flashy demo video — happy to send it your way!