vrgamedevgirl84/Wan14BT2VFusioniX

Text-to-Video

diffusion

merged-model

video-generation

wan2.1


vrgamedevgirl84 committed on Jun 11

Commit

6e7ed8c

verified ·

1 Parent(s): ac36e47

Update README.md


README.md +99 -47


@@ -11,91 +11,143 @@ widget:
     *(🔸 Before vs After — Left: Wan2.1 | Right: Merged model Wan14BT2V_MasterModel)*
   output:
     url: videos/AnimateDiff_00001.mp4
 base_model:
 - Wan-AI/Wan2.1-T2V-14B
 license: apache-2.0
 ---
-# 🎥 Wan2.1_14B_T2V-FusionX - Formerly named MasterModel, which served as a placeholder. Its the exact same model, just a new name. This inlucdes the fp8 and fp16 version.
-A powerful merged **text-to-video model** based on the original [WAN 2.1 T2V](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B) model, enhanced using multiple open-source components and LoRAs to boost motion realism, temporal consistency, and expressive detail.
-multiple open-source models and LoRAs to boost temporal quality, expressiveness, and motion realism.
 ---
-## 🧠 Model Blend Includes:
-- 🔗 [AccVideo](https://github.com/aejion/AccVideo)
-- 🔗 [MoviiGen1.1](https://huggingface.co/ZuluVision/MoviiGen1.1)
-- 🔗 [CausVid](https://github.com/tianweiy/CausVid)
-- 🔗 [MPS Rewards LoRA](https://huggingface.co/alibaba-pai/Wan2.1-Fun-Reward-LoRAs)
-- ✨ Custom detail-enhancer LoRAs I created specifically for this merge
-- Only 8-10 steps are needed to get great results!
--
-All are under **Apache 2.0** or **MIT** licenses and fully permitted for merge and reuse.
 ---
-## 🖼️ Example Prompt
-> **Prompt:**
-> Tight close-up of her smiling lips and sparkling eyes, catching golden hour sunlight. She wears a white sundress with floral prints and a wide-brimmed straw hat. Camera pulls back in a dolly motion, revealing her twirling under a cherry blossom tree. Petals flutter in the air, casting playful shadows. Soft lens flares enhance the euphoric, dreamlike vibe.
-> **Negative Prompt (CN):**
-> 色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走
 ---
-## 🧰 How to Use (ComfyUI)
-1. Download the `.safetensors` file from the [Files tab](https://huggingface.co/vrgamedevgirl84/Wan14BT2V_MasterModel/tree/main).
-2. Place it into your `ComfyUI/models/diffusion_models/` folder.
-3. Restart ComfyUI.
-4. Use the **Checkpoint Loader** node and connect it to your text-to-video workflow.
-5. Since CausVid is merged into this model, you only need no more than 10 steps to get great results.
 ---
-## 📦 Downloads
-Weights available in `.safetensors` format
-👉 [Download here](https://huggingface.co/vrgamedevgirl84/Wan14BT2V_MasterModel/tree/main)
 ---
-## ⚠️ License Notice
-This merged model combines components licensed under **Apache 2.0** and **MIT** — both of which are permissive open-source licenses.
-### ✅ You are allowed to:
-- Use, modify, and redistribute the model (including commercial use)
-- Integrate it into your own projects or tools
-### 📌 You must:
-- Include original license notices if you redistribute the model
-- Avoid implying endorsement or affiliation with the original authors (as required by Apache 2.0)
-### 🔄 Output Content:
-- Generated videos **are not licensed** by the model’s open-source license
-- If any merged model relied on datasets with restrictions, those rules may still apply to the outputs
-This model is intended for **research and creative exploration**, not guaranteed for production use without further validation.
 ---
-## 🙌 Thanks To
-- Alibaba-PAI
-- aejion
-- ZuluVision
-- Tianwei Yin (CausVid)
-- Kaji
-Big thanks to all original devs — this merge wouldn't be possible without your amazing work.
 ---

     *(🔸 Before vs After — Left: Wan2.1 | Right: Merged model Wan14BT2V_MasterModel)*
   output:
     url: videos/AnimateDiff_00001.mp4
 base_model:
 - Wan-AI/Wan2.1-T2V-14B
 license: apache-2.0
 ---
+# Wan2.1_14B_FusionX
+Merged models for faster, richer motion & detail — high performance even at just 8 steps.
+> 📌 Important: Please read the full description. Small setting changes can drastically affect results. I've tested and documented better settings below — don't skip it!
+---
+## 📂 Workflows
+Workflows can be found **[HERE](#)** (WIP — more coming soon)
+---
+## 🚀 Overview
+A powerful text-to-video model built on top of **WAN 2.1 14B**, merged with several research-grade models to boost:
+- Motion quality
+- Scene consistency
+- Visual detail
+Output quality aims to be comparable with closed-source solutions, while the model stays open and is optimized for **ComfyUI** workflows.
 ---
+## 💡 Inside the Fusion
+This model includes the following merged components:
+- **CausVid** – Causal motion modeling for better flow and dynamics
+- **AccVideo** – Better temporal alignment and speed boost
+- **MoviiGen1.1** – Cinematic smoothness and lighting
+- **MPS Reward LoRA** – Tuned for motion and detail
+- **Custom LoRAs** – For texture, clarity, and facial enhancements
+All merged models use permissive open licenses (Apache 2.0 / MIT).
 ---
+## 🔧 Usage Details
+### Text-to-Video
+- **CFG**: Must be set to `1`
+- **Shift**:
+  - `1024x576`: Start at `1`
+  - `1080x720`: Start at `2`
+  - For realism → lower values
+  - For stylized → test `3–9`
+- **Scheduler**:
+  - Recommended: `uni_pc`
+  - Alternative: `flowmatch_causvid` (better for some details)
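The text-to-video starting points above can be captured in a small lookup, which may help when scripting batches of runs. This is only a sketch: `SamplerSettings` and `t2v_settings` are hypothetical names invented for illustration and are not part of ComfyUI or any wrapper. The default of 8 steps follows the step count quoted in this card, and resolutions the card does not list are rejected instead of guessed.

```python
from dataclasses import dataclass

# Starting shift values listed in the card, keyed by (width, height).
# Resolutions the card does not mention are deliberately absent.
T2V_START_SHIFT = {
    (1024, 576): 1.0,
    (1080, 720): 2.0,
}


@dataclass(frozen=True)
class SamplerSettings:
    """Hypothetical container for the knobs the card talks about."""
    cfg: float
    shift: float
    scheduler: str
    steps: int


def t2v_settings(width: int, height: int, steps: int = 8) -> SamplerSettings:
    """Return the card's suggested text-to-video starting point."""
    try:
        shift = T2V_START_SHIFT[(width, height)]
    except KeyError:
        raise ValueError(
            f"no documented starting shift for {width}x{height}"
        ) from None
    # CFG is fixed at 1 and uni_pc is the recommended scheduler.
    return SamplerSettings(cfg=1.0, shift=shift, scheduler="uni_pc", steps=steps)


if __name__ == "__main__":
    print(t2v_settings(1024, 576))
```

The returned shift is a starting value only; the card suggests lowering it for realism and testing `3–9` for stylized output.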
+### Image-to-Video
+- **CFG**: `1`
+- **Shift**: `2` works best in most cases
+- **Scheduler**:
+  - Recommended: `dpm++_sde/beta`
+- To boost motion and reduce slow-mo effect:
+  - Frame count: `121`
+  - FPS: `24`
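As a quick sanity check on the frame count and FPS above, clip length is simply frames divided by frames per second. The sketch below works that out; `clip_seconds` is a hypothetical helper, and the 16 fps comparison value is an assumption added for contrast, not a figure from this card (the 81-frame count does appear in the performance notes).

```python
def clip_seconds(frame_count: int, fps: float) -> float:
    """Playback duration in seconds for a clip of frame_count frames."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


# Values recommended in the card for image-to-video.
I2V_FRAMES = 121
I2V_FPS = 24

# Comparison point: 81 frames is mentioned in the card,
# but 16 fps is an ASSUMPTION used only for contrast.
ASSUMED_FRAMES = 81
ASSUMED_FPS = 16

if __name__ == "__main__":
    print(f"{I2V_FRAMES} frames @ {I2V_FPS} fps: "
          f"{clip_seconds(I2V_FRAMES, I2V_FPS):.2f} s")  # 5.04 s
    print(f"{ASSUMED_FRAMES} frames @ {ASSUMED_FPS} fps: "
          f"{clip_seconds(ASSUMED_FRAMES, ASSUMED_FPS):.2f} s")
```

Under that assumption both settings give a clip of roughly five seconds, so the recommended values change how many frames cover that time, not how long the clip runs.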
 ---
+## 🛠 Technical Notes
+- Works in as few as **6 steps**
+- Best quality at **8–10 steps**
+- Drop-in replacement for `Wan2.1-T2V-14B`
+- Up to **50% faster rendering**, especially with **SageAttn**
+- Works natively and with **Kijai's Wan wrapper**
+  [Wrapper GitHub](https://github.com/kijai/ComfyUI-WanVideoWrapper)
+- Do **not** re-add merged LoRAs (CausVid, AccVideo, MPS)
+- Feel free to add **other LoRAs** for style/variation
+- Native WAN workflows also supported (slightly slower)
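One way to guard against re-adding a LoRA that is already merged is a simple filename check before building a workflow. The sketch below is a naive heuristic, not a feature of ComfyUI or the wrapper: the function names and example filenames are hypothetical, and substring matching can produce false positives (for example, any filename that happens to contain "mps").

```python
from typing import Iterable, List, Tuple

# Components the card says must not be re-added as LoRAs.
MERGED_COMPONENTS = ("causvid", "accvideo", "mps")


def already_merged(lora_filename: str) -> bool:
    """True if the filename mentions a component already merged in."""
    name = lora_filename.lower()
    return any(tag in name for tag in MERGED_COMPONENTS)


def split_loras(filenames: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split filenames into (keep, skip) lists using the naive check."""
    keep: List[str] = []
    skip: List[str] = []
    for filename in filenames:
        (skip if already_merged(filename) else keep).append(filename)
    return keep, skip


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    candidates = [
        "my_style_lora.safetensors",
        "Wan21_CausVid_14B_lora.safetensors",
        "film_grain.safetensors",
    ]
    keep, skip = split_loras(candidates)
    print("keep:", keep)
    print("skip:", skip)
```

Style or variation LoRAs pass through untouched, which matches the card's advice that other LoRAs are fine to add.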
 ---
+## 🧪 Performance Tips
+- RTX 5090 → ~138 sec/video at 1024x576 / 81 frames
+- If VRAM is limited:
+  - Enable block swapping
+  - Start with `5` blocks and adjust as needed
+- Use **SageAttn** for ~30% speedup (wrapper only)
+- Do **not** use `teacache`
+- "Enhance a video" (tested): Adds vibrance (try values 2–4)
+- "SLG" not tested — feel free to explore
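The "start with `5` blocks and adjust as needed" advice can be read as a simple retry loop: try a render, and if it runs out of memory, swap more blocks and try again. The sketch below illustrates that loop under several assumptions: `run` is a hypothetical callable standing in for one render attempt, `MemoryError` stands in for whatever out-of-memory error the real backend raises, and the step of 5 and ceiling of 40 blocks are illustrative values, not numbers from this card.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def find_block_swap(
    run: Callable[[int], T],
    start: int = 5,      # starting point suggested in the card
    step: int = 5,       # ASSUMPTION: increment per retry
    max_blocks: int = 40,  # ASSUMPTION: upper bound on swappable blocks
) -> Tuple[int, T]:
    """Raise the block-swap count until run(blocks) stops failing."""
    blocks = start
    while blocks <= max_blocks:
        try:
            return blocks, run(blocks)
        except MemoryError:
            # Not enough VRAM at this setting: offload more blocks.
            blocks += step
    raise MemoryError(f"still out of memory with {max_blocks} blocks swapped")


if __name__ == "__main__":
    def fake_render(blocks: int) -> str:
        # Stand-in for a real render: pretend fewer than 15 blocks fails.
        if blocks < 15:
            raise MemoryError("simulated out-of-memory")
        return "ok"

    print(find_block_swap(fake_render))
```

Swapping more blocks than necessary costs speed, so the loop stops at the first setting that succeeds.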
 ---
+## 🧠 Prompt Help
+Want better cinematic prompts? Try the **WAN Cinematic Video Prompt Generator GPT** — it adds visual richness and makes a big difference in quality. [Download Here](https://chatgpt.com/g/g-67c3a6d6d19c81919b3247d2bfd01d0b-wan-cinematic-video-prompt-generator)
+---
+## 📣 Join The Community
+We’re building a friendly space to chat, share outputs, and get help.
+- Motion LoRAs coming soon
+- Tips, updates, and support from other users
+👉 [Join the Discord](https://discord.com/invite/hxPmmXmRW3)
 ---
+## ⚖️ License
+Merged under permissive licenses:
+- Apache 2.0 / MIT
+- You **can** use, modify, and redistribute
+- You **must** retain original license info
+- Generated outputs are not necessarily covered by these licenses; do your own due diligence
+This model is for **research, education, and personal use** only. Commercial use is your own responsibility. Please consult a legal advisor before monetizing outputs.
 ---
+## 🙏 Credits
+- WAN Team (base model)
+- aejion (AccVideo)
+- Tianwei Yin (CausVid)
+- ZuluVision (MoviiGen)
+- Alibaba PAI (MPS LoRA)
+- Kijai (ComfyUI Wrapper)
+And thanks to the open-source community!
+---