Xiang_Handsome Text-to-Video Generation

This repository contains the necessary steps and scripts to generate videos using the Xiang_Handsome text-to-video model. The model leverages LoRA (Low-Rank Adaptation) weights and pre-trained components to create high-quality anime-style videos based on textual prompts.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

  • Ubuntu (or a compatible Linux distribution)
  • Python 3.x
  • pip (Python package manager)
  • Git
  • Git LFS (Git Large File Storage)
  • FFmpeg
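
As a quick way to confirm the prerequisites before installing anything, the sketch below looks each tool up on PATH. The executable names (`python3`, `pip`, `git`, `git-lfs`, `ffmpeg`) are assumptions about how these tools are usually installed, not something this repository specifies.

```python
import shutil

# Executable names are assumptions: the usual command names for the
# prerequisites listed above. Adjust them if your system differs.
REQUIRED_TOOLS = ["python3", "pip", "git", "git-lfs", "ffmpeg"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

Anything reported as missing can be installed with the system package manager before continuing.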

Installation

  1. Update and Install Dependencies

    sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
    
  2. Clone the Repository

    git clone https://huggingface.co/svjack/Xiang_Handsome_wan_2_1_1_3_B_text2video_lora 
    cd Xiang_Handsome_wan_2_1_1_3_B_text2video_lora 
    
  3. Install Python Dependencies

    pip install torch torchvision
    pip install -r requirements.txt
    pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
    pip install moviepy==1.0.3
    pip install sageattention==1.0.6
    
  4. Download Model Weights

    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/models_t5_umt5-xxl-enc-bf16.pth
    wget https://huggingface.co/DeepBeepMeep/Wan2.1/resolve/main/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
    wget https://huggingface.co/Wan-AI/Wan2.1-T2V-14B/resolve/main/Wan2.1_VAE.pth
    wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_bf16.safetensors
    wget https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_14B_bf16.safetensors
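
Once the downloads finish, it may help to confirm that every file landed where the generation script will look for it. The following is a minimal sketch that assumes the files were saved into the current directory, which is what the `wget` commands above do by default. The file names are copied from those commands; the helper name `missing_weights` is hypothetical.

```python
from pathlib import Path

# File names taken from the wget commands above.
WEIGHT_FILES = [
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
    "wan2.1_t2v_1.3B_bf16.safetensors",
    "wan2.1_t2v_14B_bf16.safetensors",
]


def missing_weights(directory=".", names=WEIGHT_FILES):
    """Return the weight files that are absent or empty in `directory`."""
    base = Path(directory)
    missing = []
    for name in names:
        path = base / name
        # A zero-byte file counts as missing. Partially downloaded files
        # are not detected by this check.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_weights()
    print("Missing weight files:", absent if absent else "none")
```

The usage examples in this README only reference the T5, VAE, and 1.3B diffusion weights, so a missing 14B or CLIP file does not necessarily block them.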
    

Usage

To generate a video, use the wan_generate_video.py script with the appropriate parameters. Below are examples of how to generate videos using the Xiang_Handsome model.

Burger

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Xiang_Handsome_outputs/Xiang_Handsome_w1_3_lora-000120.safetensors \
--lora_multiplier 1.0 \
--prompt "In the style of Xiang InfiniteYou Handsome, Xiang, a young person with short, black hair and glasses, facing the camera directly, eating a burger."

Burger with cream

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Xiang_Handsome_outputs/Xiang_Handsome_w1_3_lora-000120.safetensors \
--lora_multiplier 1.0 \
--prompt "In the style of Xiang InfiniteYou Handsome, Xiang, a young person with short, black hair and glasses, facing the camera directly, sinking his teeth into a towering, sauce-drizzled burger with unbridled enthusiasm. A dollop of creamy mayo clings to the corner of his lips as he pulls away, the crisp lettuce crunching audibly between bites. His eyes light up with each flavorful mouthful, the melted cheese stretching in gooey strands before snapping—pure, messy, delicious satisfaction."

Rose

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Xiang_Handsome_outputs/Xiang_Handsome_w1_3_lora-000120.safetensors \
--lora_multiplier 1.0 --seed 125 \
--prompt "In the style of Xiang InfiniteYou Handsome, Xiang, a young person with short, black hair and glasses, facing the camera directly, his hands gracefully cradling a lush bouquet of crimson roses. With a tender flick of his wrist, he sets the blossoms into a gentle sway, their velvety petals catching the light as they dance. A single rose petal drifts lazily through the air, while the rich floral scent seems to shimmer around him—an elegant, poetic moment frozen in time."

Office

python wan_generate_video.py --fp8 --task t2v-1.3B --video_size 480 832 --video_length 81 --infer_steps 35 \
--save_path save --output_type both \
--dit wan2.1_t2v_1.3B_bf16.safetensors --vae Wan2.1_VAE.pth \
--t5 models_t5_umt5-xxl-enc-bf16.pth \
--attn_mode torch \
--lora_weight Xiang_Handsome_outputs/Xiang_Handsome_w1_3_lora-000120.safetensors \
--lora_multiplier 1.0 --seed 25 \
--prompt "In the style of Xiang InfiniteYou Handsome, Xiang, a young person with short, black hair and glasses, stands in a quiet office space. The soft glow of a desk lamp casts a warm light across his thoughtful expression, while the hum of distant keyboards and the faint scent of coffee linger in the air. Outside the window, the city lights twinkle like distant stars, blending with the muted glow of computer screens as the workday stretches on around him."

Parameters

  • --fp8: Enable FP8 precision (optional).
  • --task: Specify the task (e.g., t2v-1.3B).
  • --video_size: Set the resolution of the generated video as two integers (e.g., 480 832, as in the examples above).
  • --video_length: Define the length of the video in frames.
  • --infer_steps: Number of inference steps.
  • --save_path: Directory to save the generated video.
  • --output_type: Output type (e.g., both for video and frames).
  • --dit: Path to the diffusion model weights.
  • --vae: Path to the VAE model weights.
  • --t5: Path to the T5 model weights.
  • --attn_mode: Attention mode (e.g., torch).
  • --lora_weight: Path to the LoRA weights.
  • --lora_multiplier: Multiplier for LoRA weights.
  • --prompt: Textual prompt for video generation.
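
The usage examples differ only in their prompt and optional seed, so the shared flags can be collected in one place. The sketch below builds the argument list with the values shown in the examples; `build_command` is a hypothetical helper, not part of this repository, and it only prints the command rather than running it.

```python
import shlex


def build_command(prompt, seed=None, lora_multiplier=1.0):
    """Build the wan_generate_video.py argument list used in the examples."""
    args = [
        "python", "wan_generate_video.py", "--fp8",
        "--task", "t2v-1.3B",
        "--video_size", "480", "832",
        "--video_length", "81",
        "--infer_steps", "35",
        "--save_path", "save",
        "--output_type", "both",
        "--dit", "wan2.1_t2v_1.3B_bf16.safetensors",
        "--vae", "Wan2.1_VAE.pth",
        "--t5", "models_t5_umt5-xxl-enc-bf16.pth",
        "--attn_mode", "torch",
        "--lora_weight",
        "Xiang_Handsome_outputs/Xiang_Handsome_w1_3_lora-000120.safetensors",
        "--lora_multiplier", str(lora_multiplier),
    ]
    # --seed appears only in some of the examples, so it is optional here.
    if seed is not None:
        args += ["--seed", str(seed)]
    args += ["--prompt", prompt]
    return args


if __name__ == "__main__":
    command = build_command(
        "In the style of Xiang InfiniteYou Handsome, Xiang, a young person "
        "with short, black hair and glasses, facing the camera directly, "
        "eating a burger.",
        seed=125,
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(command))
```

To launch the generation from Python instead of copying the printed command, the same list can be passed to `subprocess.run(command, check=True)`.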

Output

The generated video and frames will be saved in the specified save_path directory.

Troubleshooting

  • Ensure all dependencies are correctly installed.
  • Verify that the model weights are downloaded and placed in the correct locations.
  • Check for any missing Python packages and install them using pip.
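
One way to check for missing Python packages is to ask the interpreter whether each module can be located, without importing it. In the sketch below the import names are assumptions derived from the pip package names in the installation step (for example, `ascii-magic` is assumed to import as `ascii_magic`), and packages pulled in by requirements.txt are not covered.

```python
import importlib.util

# Import names are assumptions based on the pip package names used in
# the installation step; they may not match every package exactly.
PACKAGES = [
    "torch", "torchvision", "ascii_magic", "matplotlib", "tensorboard",
    "huggingface_hub", "datasets", "moviepy", "sageattention",
]


def missing_packages(names):
    """Return the top-level modules in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Modules not found:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Any module reported as not found can then be installed with pip under its package name.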

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

  • Hugging Face for hosting the model weights.
  • Wan-AI for providing the pre-trained models.
  • DeepBeepMeep for contributing to the model weights.

Contact

For any questions or issues, please open an issue on the repository or contact the maintainer.

