---
license: apache-2.0
---
# Wan2.1-1.3B-LoRA-Speed-Control-v1

## Model Introduction

This model is trained on top of [Wan2.1-T2V-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) using the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Following a structure similar to T2I-Adapter, it adds a motion speed encoder that takes a *motion bucket id* as input to control the magnitude of motion. The encoder uses RoPE encoding followed by an MLP. The *motion bucket id* is computed from the quantiles of the standard deviation of the latents along the temporal axis and mapped to a control range of 0–100.

* **motion bucket id = 1**: Slower motion, enhanced visual quality  
* **motion bucket id = 100**: Faster motion, reduced visual quality  
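
The quantile mapping described above can be illustrated with a small NumPy sketch. It is not the training code: the latent layout `(channels, frames, height, width)`, the helper names `temporal_std` and `motion_bucket_id`, the randomly generated reference set, and the percentile-rank formula are all assumptions made for illustration. The source only states that the id comes from quantiles of the temporal standard deviation, mapped to 0–100.

```python
import numpy as np


def temporal_std(latents: np.ndarray) -> float:
    """Motion score of one clip: std along the time axis, averaged over the rest.

    `latents` is assumed to have shape (channels, frames, height, width).
    """
    return float(latents.std(axis=1).mean())


def motion_bucket_id(score: float, reference_scores: np.ndarray) -> float:
    """Map a motion score to 0-100 via its quantile among reference scores."""
    sorted_scores = np.sort(np.asarray(reference_scores, dtype=np.float64))
    # Number of reference clips whose score is <= this clip's score.
    rank = np.searchsorted(sorted_scores, score, side="right")
    return 100.0 * float(rank) / len(sorted_scores)


rng = np.random.default_rng(0)

# Hypothetical reference set: 200 random "clips" with growing temporal variation.
reference = np.array([
    temporal_std(rng.normal(scale=s, size=(4, 8, 6, 6)))
    for s in np.linspace(0.1, 2.0, 200)
])

# A clip whose frames are all identical has (near-)zero temporal variation.
static_clip = np.repeat(rng.normal(size=(4, 1, 6, 6)), 8, axis=1)
# A clip with much larger frame-to-frame variation than any reference clip.
fast_clip = rng.normal(scale=5.0, size=(4, 8, 6, 6))

print(motion_bucket_id(temporal_std(static_clip), reference))
print(motion_bucket_id(temporal_std(fast_clip), reference))
```

In this sketch a clip with no change between frames lands at the bottom of the range and a clip with more variation than every reference clip lands at the top, which matches the intended meaning of low and high ids.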

## Model Performance

**Prompt**: Documentary photography style, an energetic puppy rapidly running on lush green grass. The puppy has brownish-yellow fur, upright ears, and an expression that is focused and joyful. Sunlight shines on its body, making its fur appear exceptionally soft and shiny. The background features an open grassland with occasional wildflowers, and in the distance, a faint view of blue sky and scattered clouds. Strong perspective emphasizes the dynamic movement of the running puppy and the vitality of the surrounding grass. Medium shot with a side-moving viewpoint.

**Negative Prompt**: Vivid colors, overexposure, static, blurry details, subtitles, style, artwork, painting, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, defective, extra fingers, poorly drawn hands, poorly drawn face, deformed limbs, fused fingers, motionless frames, cluttered background, three legs, crowded background people, walking backwards.

**Example 1** (seed=1, left: motion bucket id=1, right: motion bucket id=100)

<div align="center"><video width="80%" controls><source src="video_merged_1.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>

**Example 2** (seed=2, left: motion bucket id=1, right: motion bucket id=100)

<div align="center"><video width="80%" controls><source src="video_merged_2.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>

**Example 3** (seed=3, left: motion bucket id=1, right: motion bucket id=100)

<div align="center"><video width="80%" controls><source src="video_merged_3.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>

## Usage Instructions

This model is trained using the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Please install it first:

```shell
pip install diffsynth
```

The following script downloads the weights, loads the pipeline, and generates a slow and a fast video from the same prompt and seed:

```python
import torch
from diffsynth import ModelManager, WanVideoPipeline, save_video
from modelscope import snapshot_download


# Download models
snapshot_download("Wan-AI/Wan2.1-T2V-1.3B", local_dir="models/Wan-AI/Wan2.1-T2V-1.3B")
snapshot_download("DiffSynth-Studio/Wan2.1-1.3b-speedcontrol-v1", local_dir="models/DiffSynth-Studio/Wan2.1-1.3b-speedcontrol-v1")

# Load models
model_manager = ModelManager(device="cpu")
model_manager.load_models(
    [
        "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
        "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
        "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
        "models/DiffSynth-Studio/Wan2.1-1.3b-speedcontrol-v1/model.safetensors",
    ],
    torch_dtype=torch.bfloat16, # You can set `torch_dtype=torch.float8_e4m3fn` to enable FP8 quantization.
)
pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
pipe.enable_vram_management(num_persistent_param_in_dit=None) # You can set `num_persistent_param_in_dit` to a small number to reduce the VRAM required.

# Text-to-video: the same prompt and seed, rendered at two motion speeds
prompt = "Documentary photography style scene: a lively little dog rapidly running on a green grassy field. The dog has a brownish-yellow coat, upright ears, and an expression of focus and joy. Sunlight shines on its body, making its fur appear exceptionally soft and shiny. The background is an open grassland, occasionally dotted with a few wildflowers, with a faint view of blue sky and some white clouds in the distance. Strong sense of perspective captures the dynamic motion of the running dog and the vitality of the surrounding grass. Medium shot with a side-moving viewpoint."
negative_prompt = "vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, still, overall grayish, worst quality, low quality, JPEG compression artifacts, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards"

# Lowest motion bucket id: slower motion
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    seed=1, tiled=True,
    motion_bucket_id=0,
)
save_video(video, "video_slow.mp4", fps=15, quality=5)

# Highest motion bucket id: faster motion
video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    seed=1, tiled=True,
    motion_bucket_id=100,
)
save_video(video, "video_fast.mp4", fps=15, quality=5)
```
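
To compare more than two speeds, the same call can be repeated over several values in the 0–100 range. The sketch below only plans such a sweep; `plan_speed_sweep` and the output filename pattern are hypothetical and not part of DiffSynth-Studio. The commented lines show where the `pipe(...)` and `save_video(...)` calls from the script above would go.

```python
def plan_speed_sweep(num_steps: int = 5, low: int = 0, high: int = 100):
    """Return (motion_bucket_id, output_filename) pairs evenly spaced over [low, high]."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ids = [round(low + i * (high - low) / (num_steps - 1)) for i in range(num_steps)]
    return [(bucket_id, f"video_speed_{bucket_id:03d}.mp4") for bucket_id in ids]


for bucket_id, filename in plan_speed_sweep():
    print(bucket_id, filename)
    # With the pipeline from the script above loaded, each entry would be rendered
    # using the same prompt, negative prompt, and seed as before:
    # video = pipe(..., seed=1, tiled=True, motion_bucket_id=bucket_id)
    # save_video(video, filename, fps=15, quality=5)
```

Keeping the seed fixed across the sweep, as the two-video script does, makes it easier to attribute differences between the outputs to the motion bucket id alone.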