---
license: apache-2.0
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
---

I have added some experimental versions of the Wan 2.1 1.3B model. These are distilled + hires + refined editions at different levels.
Normally, the medium models should be the best, but this is experimental, and I haven't had time to test every situation.
However, the first results look promising.

You can generate videos with 4, 5, 6, 8, 10, or more steps. 
This is my first version, and if I notice any issues, I will try to fix them later. 
From my tests, it works well, but as I mentioned before, I haven't tested every possible situation.

You can find a workflow for ComfyUI to test the models if you're interested.

IMPORTANT:
If you use a higher step count, try another sampler such as euler, and try a different scheduler too.
You may also need to increase the CFG with a higher step count.
Generally, the animation is better with more steps, but it also takes more time. 
With a lower step count, the animation can be a bit more random.
If the video colors are too intense, try reducing the CFG or using a lower version of the model.
For example, with 20 steps or more, it is better to use the lower model version and adjust the CFG to correct the colors.
Remember, these models are modified and do not behave like the original ones.

If anyone wants to add sounds or voices, come back a bit later; I will provide the workflows to do so.

Video Example:
https://youtu.be/kfokkXEGByU

Notice: This works pretty well up close, but after some testing I noticed that the background is blurrier than with the original model, and I think I know why.
This experimental test version provides a big boost, but to fix the blur I'll have to redo the models.
I'll try to do that by the end of the week, so I'll need to update some models, or maybe reduce everything to just three or four models.
Sorry for the inconvenience!
I hope people can still have fun with this test version in the meantime.

This weekend, I’m going to create some workflows to use the original model and achieve good quality. 
I’ll also add workflows for sound and voices. 
I will likely fix and update the models as well. 
So if there’s a model you like among those I’ve uploaded, make sure to save it, as I’ll be deleting them over the weekend to update them with new versions along with the other workflows. 
I’ll have a bit more time to fix and test everything further to refine it as much as possible. 
Until then, have fun with these experimental versions!

---
LoRAs for Wan-AI/Wan2.1-T2V-1.3B

The LoRAs come from the DiffSynth-Studio team.

https://github.com/modelscope/DiffSynth-Studio

I have only converted the LoRAs to make them compatible with ComfyUI.
The original LoRA models are published by DiffSynth-Studio on ModelScope.

https://modelscope.cn/organization/DiffSynth-Studio

----------------

Wan2.1-1.3b-lora-aesthetics-v1:

Model Overview

This LoRA model is trained on the Wan2.1-1.3B model using the DiffSynth-Studio framework. It has been fine-tuned on an aesthetics-focused dataset, enhancing the visual appeal of generated videos. Additionally, the classifier-free guidance can be disabled to speed up the process.

Recommended Settings
cfg_scale = 1

sigma_shift = 10

Note
Using this model may reduce the diversity of generated videos. It is recommended to adjust the lora_alpha value to fine-tune the LoRA’s impact on the final output.
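
As a rough illustration of why lora_alpha controls the LoRA's impact, the sketch below uses the generic LoRA formulation, in which a low-rank update is scaled by alpha before being added to a base weight. This is an assumption about the general technique, not a description of how DiffSynth-Studio or ComfyUI apply these specific LoRAs. The toy matrices and the `matmul` and `apply_lora` helpers are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base, lora_up, lora_down, alpha):
    """Return base + alpha * (lora_up @ lora_down); alpha = 0 leaves base unchanged."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 weight with a rank-1 update (values are illustrative only).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]     # shape 2x1
lora_down = [[0.5, -0.5]]    # shape 1x2

full = apply_lora(base, lora_up, lora_down, alpha=1.0)
half = apply_lora(base, lora_up, lora_down, alpha=0.5)
off = apply_lora(base, lora_up, lora_down, alpha=0.0)

for name, weights in (("alpha=1.0", full), ("alpha=0.5", half), ("alpha=0.0", off)):
    print(name, weights)
```

Under this formulation, lowering alpha moves the weights back toward the base model, which is one way to trade some of the aesthetic effect for more diversity.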

----------------

Wan2.1-1.3b-lora-speedcontrol-v1:
This LoRA model is based on the Wan2.1-1.3B model and has been trained using the DiffSynth-Studio framework.
It allows control over the speed of generated videos by adjusting the LoRA alpha parameter.

LoRA alpha > 0: Use the low speed trigger → Slower speed, improved image quality.

LoRA alpha < 0: Use the high speed trigger → Faster speed, reduced image quality.

Currently, the effects of this model are not yet fully stable, and optimizations are still in progress.

Model Results
Prompt Used:
"A documentary-style photograph: a lively little white dog runs swiftly across a lush green lawn. Its fur is bright white, its two ears stand upright, and it has a focused yet joyful expression. Sunlight illuminates its coat, making it look particularly soft and shiny. In the background, a vast meadow scattered with a few wildflowers stretches toward the horizon, where a blue sky with a few white clouds can be seen. The perspective is dynamic, capturing the dog's movement and the surrounding energy of the grass. Side view, medium shot, moving camera."

Negative Prompt:
Overly bright colors, overexposure, static, blurry details, subtitles, artistic style, painting, still image, grayish tint, very poor quality, low quality, JPEG compression artifacts, ugly, deformed, extra fingers, poorly drawn hands, distorted faces, disfigured, malformed limbs, fused fingers, static scene, cluttered background, three legs, crowd in the background, characters walking upside down.

Effect of LoRA Alpha Parameter:
LoRA alpha = 0.7 → Slower speed, better visual quality.

LoRA alpha = 0 → Normal speed, neutral effect.

LoRA alpha = -0.5 → Faster speed, reduced visual quality.
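
The sign convention above can be summarized in a small helper. This is only a sketch: `speed_mode` is a hypothetical function, and it returns descriptions rather than actual prompt trigger words, since the exact trigger wording is not given here.

```python
def speed_mode(lora_alpha):
    """Map the sign of lora_alpha to the behavior described in this section."""
    if lora_alpha > 0:
        return "low speed trigger: slower speed, better visual quality"
    if lora_alpha < 0:
        return "high speed trigger: faster speed, reduced visual quality"
    return "no trigger: normal speed, neutral effect"


# The three example values listed above.
for alpha in (0.7, 0.0, -0.5):
    print(f"LoRA alpha = {alpha:+.1f} -> {speed_mode(alpha)}")
```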

----------------

Wan2.1-1.3b-lora-highresfix-v1:

Model Overview

This LoRA model is trained based on the Wan2.1-1.3B model using the DiffSynth-Studio framework. Since the base model was originally trained at 480p resolution, it has certain limitations in sharpness. To address this, additional training was conducted to enhance the quality of high-resolution videos, preventing image collapse or a dull appearance.

Recommended Usage

1. Directly generate short high-resolution videos: set the resolution to 1024 × 1024 while slightly reducing the number of frames to avoid excessively long generation times.

2. Refine details in a high-resolution video:
   - First, generate a low-resolution video.
   - Then apply upscaling to increase the resolution.
   - Finally, use this model to enhance visual details.
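
For the upscale-then-refine route, it can help to work out the target size before building the workflow. The sketch below is a hypothetical helper, not part of any tool mentioned here. It assumes a low-resolution starting size of 832 × 480, an upscale factor of 2, and that each side should be a multiple of 16; adjust all three to match your own setup.

```python
def upscaled_size(width, height, factor, multiple=16):
    """Scale a resolution and round each side down to a multiple of `multiple`."""

    def snap(value):
        # Never go below one block, even for tiny inputs.
        return max(multiple, int(value * factor) // multiple * multiple)

    return snap(width), snap(height)


low_res = (832, 480)  # assumed 480p starting size
high_res = upscaled_size(*low_res, factor=2.0)
print(low_res, "->", high_res)
```

The low-resolution pass would use `low_res`, and the upscaled video at `high_res` would then go through the refinement pass with this LoRA active.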

Model Effects
Anime / 2D Style
Prompt: Anime style, an adorable 2D-style girl with short black hair flowing in the wind, gently turning her head.

Negative Prompt: Overly bright colors, overexposure, static, blurry details, subtitles, artistic style, painting, still image, dull overall tone, poor quality, visible JPEG compression, ugly, incomplete, extra fingers, poorly drawn hands, malformed face, deformed, disfigured, distorted limbs, fused fingers, static image, cluttered background, three legs, crowd in the background, walking upside down.

Before activating LoRA → After activating LoRA

Sword and Magic
Prompt: An ancient mythology scene depicting a battle between a hero and a dragon, with steep cliffs in the background. The hero wears armor, wields a shining sword, and the dragon spreads its massive wings, ready to unleash flames.

Negative Prompt: Overly bright colors, overexposure, static, blurry details, subtitles, artistic style, painting, still image, dull overall tone, poor quality, visible JPEG compression, ugly, incomplete, extra fingers, poorly drawn hands, malformed face, deformed, disfigured, distorted limbs, fused fingers, static image, cluttered background, three legs, crowd in the background, walking upside down.

----------------

Wan2.1-1.3b-lora-exvideo-v1:

Model Overview

This LoRA model is trained based on the Wan2.1-1.3B model using the DiffSynth-Studio framework.
It enables video duration extension: once activated, this LoRA allows the generation of videos twice as long as usual.

Recommended Settings
num_frames = 161

lora_alpha = 1.0

Model Effects
📷 Documentary Photography Style
Prompt: A playful little dog wearing black sunglasses runs swiftly across a green lawn. Its fur is light brown, its ears perked up, and its expression is focused yet joyful. The sunlight highlights its fur, making it appear particularly soft and shiny. In the background, a vast meadow dotted with a few wildflowers stretches under a blue sky with scattered white clouds. The perspective is dynamic, capturing the motion of the dog's run and the vibrancy of the surrounding landscape. Side-moving camera, medium shot.

🎨 High-Definition 3D Texture
Prompt: A small white cat sprints forward on a 10-meter-high platform, then performs a backflip dive into the water. Its fur is silky, its gaze sharp, and its movements fluid and natural. In the background, a pristine blue swimming pool with a smooth and calm surface. At the moment of the jump, a spotlight from above illuminates the cat, creating a striking contrast between light and shadow. The water splashes are sharp and precise, producing a visually spectacular effect. C4D rendering, dynamic close-up.

🎭 Japanese Anime Style
Prompt: On a city street corner, a black cat crouches under a lamppost, gazing into the distance at the neon lights. Suddenly, a blue light beam descends from the sky, swiftly enveloping its body. The cat begins to levitate, its black fur slowly dissolving into the air as its body elongates. Its fur transforms into a sleek black suit, revealing a slender silhouette. Its cat ears disappear, and its facial features become human, taking on the appearance of a handsome young man with a cold gaze. He lands lightly, his suit billowing slightly in the night breeze, as the blue light fades away—an elegant and mysterious young man from the future.

🌆 Wide-Angle City Scene
Prompt: The camera provides an overview of a bustling city street. On a wide sidewalk, pedestrians move about, creating a lively and dynamic urban tableau.

Usage Instructions
This LoRA model is designed to extend video duration while maintaining visual quality.
For optimal results, set num_frames to 161 or adjust it according to your needs.
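
When adjusting num_frames, a quick calculation shows how a frame count relates to clip length. The sketch below rests on assumptions that are not stated in this card: an output rate of 16 fps, frame counts of the form 4k + 1, and a usual length of 81 frames without the LoRA. Both helpers are hypothetical.

```python
def snap_num_frames(seconds, fps=16):
    """Return the frame count of the form 4k + 1 closest to seconds * fps."""
    k = max(0, round((seconds * fps - 1) / 4))
    return 4 * k + 1


def duration_seconds(num_frames, fps=16):
    """Return the clip length in seconds for a given frame count."""
    return num_frames / fps


usual = snap_num_frames(5)      # assumed usual clip length of about 5 s
extended = snap_num_frames(10)  # about twice as long, with this LoRA active
print(usual, extended, duration_seconds(extended))
```

Under these assumptions, a 10-second target lands on the recommended num_frames = 161.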